Scaling Large Projects With Erlang
Delchanat points out a blog entry which notes,
"The two biggest computing-providers of today, Amazon and Google, are building their concurrent offerings on top of really concurrent programming languages and systems. Not only because they want to, but because they need to. If you want to build computing into a utility, you need large real-time systems running as efficiently as possible. You need your technology to be able to scale in a similar way as other, comparable utilities or large real-time systems are scaling — utilities like telephony and electricity. Erlang is a language that has all the right properties and mechanisms in place to do what utility computing requires. Amazon SimpleDB is built upon Erlang. IMDB (owned by Amazon) is switching from Perl to Erlang. Google Gears is using Erlang-style concurrency, and the list goes on."
They were right!
Perhaps the systems would be better running efficiently rather than sufficiently?
Perlang
"you need large real-time systems running as sufficiently as possible."
Should that not be efficiently as possible?
The Revolution Will Not Be Televised
I mean, It's Sufficiently Concurrent!
...wait, what?
Buzzwords that matter.
"The two biggest computing providers of today"?
What the hell does that mean?
Also, is it just me or does the article intro sound like it was written by someone who has taken way too many marketing classes?
Eh... Moore's Law says that we'll all be on the same processor in a couple years anyway.
TFA states "Because I don't want to be hooked into the (proprietary) Google stack (Python, Django, BigTable, GoogleOS) just yet" ... IMHO neither Python nor Django is proprietary. Or are they somehow proprietary in a way that the AWS stack is not?
"running as sufficiently as possible"?
Sometimes as a nation we must ask ourselves, is our children learning?
You see? You see? Your stupid minds! Stupid! Stupid!
People may also want to check out Scala at:
http://www.scala-lang.org/
It also uses the Erlang-style concurrency approach and runs on the JVM, with class compatibility with other JVM languages, i.e. Java, Groovy, etc.
1. Multicore ready.
Erlang will use them. Write your application in Erlang and it's done for you.
2. Scales well.
As an example, http://yaws.hyber.org/ scales very nicely when loads increase. Your basic LAMP/LYMP setup runs much better on vanilla hardware.
3. Designed for telecom
The architects designed the language to run in a telecom environment so things like upgrades can be done while the application is running.
Yaws in particular needs your help. Failover clustering inside the yaws server would be wonderful. Right now, it uses CGI to process other languages. It does it flawlessly, but a more direct solution might be a nice project.
http://www.maxineudall.com/2010/02/should-economists-be-sued-for-malpractice.html
Well, are they?
Man this thread seems to have had a can of *whoosh* unleashed upon it. GP was quoting a GWB speech.
I think the summary (and article) are somewhat poorly written, but that doesn't overshadow the fact that functional languages are becoming more and more interesting these days, with concurrency becoming so important.
I'd like to learn one, but there are several out there. What I'd like to see is a good in-depth comparison of different concurrent functional languages: why would I choose Haskell, or Erlang, or OCaml, for example? Are they all interpreted? (Does one exist that compiles?) Which ones support concurrency? What language features do they boast, and what are the advantages and disadvantages of these features? Do they have a complete set of libraries?
Anyone know of an article like this? I've been searching for a while. Every article on functional languages I've found seems to concentrate on a particular one, but I can't find something helping me decide which one is most worth learning.
>The two biggest computing-providers of today, Amazon as well as Google, are building their concurrent offerings on top of really concurrent programming languages and systems
Google is largely a C++ company, and C++ is a language that doesn't include explicit support for concurrency (although the next version, C++0x, will).
They mention Erlang-style concurrency only being used in a relatively small project (Gears) that most of Google's own software doesn't support yet.
Note that Google Gears is used in the excellent Google Reader software (although not much else).
If you want to build computing into a utility, you need large real-time systems running as sufficiently as possible.
But if you want to build sprockets into a weasel you need small batch-mode systems running as necessarily as possible.
If the poster had anything interesting to say (I'd guess not, but who knows!), it was totally obscured by his lack of grasp of the English language.
Oops, wrong link. I isn't learning.
http://www.youtube.com/watch?v=aAUrToY33tI
TFA implicitly states that Google is using Erlang or some concurrent programming languages. That's wrong: they use C++, Java and Python, and pretty much nothing else (apart from specialized stuff, for MacOS apps for instance).
Given that this statement appears almost halfway through the blog post, I would say that it was already too late for that.
'The tyrant will always find pretext for his tyranny.' - Aesop's Fables
1. Invariable variables.
This appears to have been done for no reason other than the designer's preference. In fact, it's not strictly true -- variables can be unbound, and later bound. They just can't be re-bound once bound.
2. Weird syntax.
Why, exactly, are there three different kinds of (required) line endings? It seems as though the syntax is designed to be as different from C as possible, while maintaining at least as many quirks. More so, even -- when constructing normal, trivial programs, you're going to hit most language features head-on and at their worst. Where's my 'print "hello\n"' that works most other places?
I don't believe the important features of Erlang are mutually-exclusive with the sane syntax of, say, Ruby or Python.
3. Not Unicode-ready.
Strings are defined as ASCII -- maybe latin1. But there's no direct Unicode support in the language -- if you're lucky, there are functions you can pipe it through.
There are other things I haven't mentioned, mostly implementation-specific -- things like the fact that function-reloading cannot be done when you natively-compile (with hipe) for extra speed. My plan is to take the features I actually like from Erlang and implement them elsewhere, in a language I can actually stomach for its real tasks.
Don't thank God, thank a doctor!
Wow, it's not often I strongly criticise articles around here, but that was total garbage.
For the smart ones that didn't RTFA, here's a quick summary:
For the record, I work for Google and we don't use Erlang anywhere in the codebase. Google Gears restricts you to message passing between threads because JavaScript interpreters are not thread-safe, so it's the only way that can work. Visual Basic threading works the same way for similar reasons. It's not because eliminating shared state is somehow noble and pure, regardless of what the article would have you believe, and in fact systems like BigTable use both shared-state concurrency and message passing based concurrency.
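To make the "message passing only" restriction concrete, here is a minimal sketch in Python (not JavaScript, and not the Gears API). The names `worker` and `square_all` are made up for illustration. The worker thread never touches the caller's data directly; everything goes through two queues.

```python
import queue
import threading


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Receive numbers until the None sentinel arrives; reply with squares."""
    while True:
        message = inbox.get()
        if message is None:  # sentinel: no more work
            break
        outbox.put(message * message)


def square_all(values):
    """Send each value to a worker thread and collect the replies in order."""
    inbox, outbox = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox, outbox))
    thread.start()
    for value in values:
        inbox.put(value)
    inbox.put(None)
    thread.join()
    # A single worker reading a FIFO queue replies in the order it was asked.
    return [outbox.get() for _ in values]


if __name__ == "__main__":
    print(square_all([1, 2, 3]))
```

The queues are the only shared objects, which is roughly the discipline a worker pool imposes when the interpreter itself is not thread-safe.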
The article says this:
But in fact the Google search engine, which is one of the larger "industrial-grade, internet-grade" systems I know of, is written entirely in C++. A language which is much the same as it was 10-15 years ago. Thus the central point of his argument seems flawed to me.
Seeing as the article is merely an advert for Erlang, I'll engage in some advocacy myself. If you have an interest in programming languages, feel free to check out Erlang, but be aware that such languages are taking options away from you, not giving you more. A multi-paradigm language like version two of D is a better way to go IMHO - it supports primitives needed to write in a functional style like transitive invariance, as well as a simple lambda syntax, easy closures and first-class support for laziness.
However, it also compiles down to self-contained native code in an intuitive way, or at least, a way that's intuitive to the 99.9% of programmers used to imperative languages, unlike Erlang or Haskell. It provides garbage collection but doesn't force you to use it, unlike Java. It doesn't rely on a VM or JIT, unlike C#. It provides some measure of C and C++ interoperability, unlike most other languages. And it has lots of time-saving and safety-enhancing features done in a clean way too.
I started getting interested in Erlang a year ago. I bought Joe Armstrong's book about it, and, when I was in a place for a few weeks with no internet connectivity, I settled down with my laptop to learn it.
However, I got stuck on two things: (a) the lack of documentation (perhaps I could have downloaded it, but over a mobile phone it would have been painful), and (b) one of the tutorials in his book seemed to include a library that he had written, to which there was no source code, and the others then built upon that, which scuppered me.
I got dispirited, but I'm still interested. I guess I'll end up learning it when I need to use it for something (like I've learned everything in life). Anyone know where the equivalent of Javadocs is for the core Erlang libraries?
Get your own free personal location tracker
Erlang is a language that has all the right properties and mechanisms in place to do what utility computing requires.
Well, except that it's darned inconvenient to actually write the applications in it.
Google Gears is using Erlang-style concurrency, and the list goes on."
Yup, and it makes more sense to add "Erlang-style concurrency" to existing languages than to throw out everything and switch to Erlang.
http://www.stackless.com/
I will usually bend over backward to use Python just because I find it very easy to write self documenting code. I have to maintain my own code and find it easy to work with code that I wrote years ago. In fact, my cow-irkers (who don't use Python at all) can easily follow what I've done. The language is sane enough that I can get help from C++ people if I get into a thinko. (a thinko is like a typo but way way worse) It also means that I can still help people with their C and C++ problems even though I have hardly written a line in those languages in the last couple of years.
If I had to write something concurrent, I would try Stackless Python. I looked at the mailing list archive and the project is active (so you can get help) and there are major projects that use it. It looks like it is worth checking out. (Obviously, I haven't used it yet)
Andy
The enthusiasm for "cloud computing" may evaporate when Xmas rolls around.
I went to a talk at Stanford by the architect of Amazon's web services. It came out in questioning that the real motivation behind Amazon's low-priced web services is that their load in the Xmas shopping season is about 4x the load for the rest of the year. Their infrastructure is sized for the November-December peak, so for ten months of the year they have vast excess capacity. That's why Amazon's web services are so cheap.
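As a back-of-the-envelope check on "vast excess capacity", here is a small sketch. It assumes (this is not from the talk) that capacity is sized exactly to the peak and that off-season load is flat; `idle_fraction` is a made-up helper name.

```python
def idle_fraction(peak_to_baseline_ratio: float) -> float:
    """Fraction of peak-sized capacity left unused at baseline load."""
    return 1.0 - 1.0 / peak_to_baseline_ratio


if __name__ == "__main__":
    # With a 4x seasonal peak, baseline load uses 1/4 of the hardware.
    print(f"{idle_fraction(4.0):.0%}")  # prints "75%"
```

Under those assumptions, roughly three quarters of the machines sit idle outside the holiday season.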
Don't expect good response time during the shopping season. Although this Xmas might be OK, due to the recession.
It's a functional language, you dolt; of course the "variables" are invariant. The only way in which it was the designer's preference is that he presumably wanted all the niceties that come with declarative languages: provability, implicit parallelism, etc.
1. Invariable variables.
This appears to have been done for no reason other than the designer's preference. In fact, it's not strictly true -- variables can be unbound, and later bound. They just can't be re-bound once bound.
Why is any syntax that's not filched from C weird? Frankly, I'm not that fond of C's syntax; it _can_ lead to very unreadable code (speaking as someone with 20 years' experience with C). If you'd done a modicum of research you'd realise that Erlang models its syntax on Prolog, and Prolog, like Lisp, has very regular syntax.
Note that I'm not claiming that Erlang, or Prolog, or Lisp syntax is that great either, but by holding up C as some kind of gold standard you've automatically lost that argument.
2. Weird syntax.
Why, exactly, are there three different kinds of (required) line endings? It seems as though the syntax is designed to be as different from C as possible, while maintaining at least as many quirks. More so, even -- when constructing normal, trivial programs, you're going to hit most language features head-on and at their worst. Where's my 'print "hello\n"' that works most other places?
I don't believe the important features of Erlang are mutually-exclusive with the sane syntax of, say, Ruby or Python.
Not sure about Unicode, you may actually have made a valid point here.
3. Not Unicode-ready.
Strings are defined as ASCII -- maybe latin1. But there's no direct Unicode support in the language -- if you're lucky, there are functions you can pipe it through.
There are other things I haven't mentioned, mostly implementation-specific -- things like the fact that function-reloading cannot be done when you natively-compile (with hipe) for extra speed. My plan is to take the features I actually like from Erlang and implement them elsewhere, in a language I can actually stomach for its real tasks.
Joe Armstrong's Erlang textbook is interesting, but I did not have time to learn the language and recode the part of our current project that would benefit from it. So I did what any sensible person would do: raided the concepts, and used them to redesign the critical parts of the application. I was initially provoked into doing this because, in the book, the comparison at one point between the Erlang and Java way of doing something is just plain wrong. When I thought out how I would actually do it in Java, I realised that it helps to stick to a language you know well.
From scarped cliff or quarried stone she cries "A thousand types are gone, I care for nothing, no not one."
This article is discussing the front-facing programmatic interface to cloud computing. It is not referring to the internal code of your company.
They built the Facebook Chat backend using Erlang. Scaling something from 0 users one day to tens of millions of active users the next day is a challenge, and they decided Erlang was the right tool for the job. http://www.facebook.com/note.php?note_id=14218138919&id=9445547199
>> you need large real-time systems running as sufficiently as possible.
lol. Are they running vista then? oh wait.. that would be INsufficient...
TFA more or less says that IMDB is switching from Perl to Erlang. So I looked at the link and here's what I got:
(From here
We are looking for developers with experience building web scale distributed systems. We are currently working in Perl but have plans to use Java, Erlang and any other language that we think will suit our purposes. We aren't looking for expertise in any of those, particularly, but we expect that you will be an expert in the systems you know. We do require that you be passionate about testing (unit, integration, fault-injection) and code quality. Experience with relational databases (Oracle, MySQL, etc), embedded databases (BerkeleyDB, CDB, MonetDB, etc) and Linux are a big plus.
I'll leave anyone to draw his own conclusions.
Oh no! What's to become of poor pudge?? Will he rather fight than switch? Or will he roll over like he does for his favorite political thugs?
I don't get it. Web transactions are generally like little programs. Concurrency is only needed when sharing info, which is what the database is used for. They need a big-ass database, not a big-ass language unless they are doing online games or the like. Are they trying to re-invent a navigational RAM database or something? Something doesn't add up.
Table-ized A.I.
http://www.mozart-oz.org/
I'll just cite another "competitor":
"The Mozart Programming System is an advanced development platform for intelligent, distributed applications. The system is the result of a decade of research in programming language design and implementation, constraint-based inference, distributed computing, and human-computer interfaces. As a result, Mozart is unequalled in expressive power and functionality. Mozart has an interactive incremental development environment and a production-quality implementation for Unix and Windows platforms. Mozart is the fruit of an ongoing research collaboration by the Mozart Consortium.
Mozart is based on the Oz language, which supports declarative programming, object-oriented programming, constraint programming, and concurrency as part of a coherent whole. For distribution, Mozart provides a true network transparent implementation with support for network awareness, openness, and fault tolerance. Mozart supports multi-core programming with its network transparent distribution and is an ideal platform for both general-purpose distributed applications as well as for hard problems requiring sophisticated optimization and inferencing abilities. We have developed many applications including sophisticated collaborative tools, multi-agent systems, and digital assistants, as well as applications in natural language understanding and knowledge representation, in scheduling and time-tabling, and in placement and configuration."
Main difference between the BSD license and the GPL license: one is from California and the other is from Massachusetts
I read that as "If you want to build computing into a utility, you need large real-time systems running as self sufficiently as possible."
You know, other than the spam blogs that just cut and paste the front of Slashdot, I can't find any reference to back this statement up. Anyone have a link that isn't a spam blog?
I can tell you that NO development outside of core network components (I have only limited insight there) is done in Erlang. Not the billing systems, not the messaging components (SMS/MMS/Voicemail), not the IVRs. But sure. Whatever. If you really want to build an old-school exchange then by all means, get your SS7 stack and Erlang away. Unfortunately the movements towards IMS aren't really helping Erlang internally. And neither is LTE.
I've had a wonderful time, but this wasn't it -- Groucho Marx
People are turning to concurrency-friendly languages as kind of a fad. A solid need has not been identified for them yet, except maybe action gaming. Outside of that, a smart use of the database takes care of most if not all of the alleged "concurrency problems" that are being generated by new chip sets.
I agree that database architecture may need an adjustment to take advantage of more RAM, but that shouldn't require a significant change in the way applications are written. DBs already insulate most apps from implementation issues, including concurrency. I'm not saying that concurrent app languages don't have their use, but let's make sure there is really a problem before we make a mad dash to claimed solutions. The IT industry has a tendency to get carried away with fake or exaggerated problems.
Research is fine, but let's approach actual implementation smartly. Maybe I've been around too long, but this smells like yet another fad.
Table-ized A.I.
It should be noted that Facebook's relatively new chat feature, which allows Facebook users to send instant messages to all their online friends as well as see status changes, notifications, and feed stories in near real time, was developed using Erlang. http://www.planeterlang.org/story.php?title=Facebook_chat_is_developed_in_Erlang
Um, have you never heard of functional programming? Erlang's a pure functional language, a design choice that opens up a number of optimization possibilities. The most obvious one is that, since programs can't mutate memory, there is no memory synchronization or locking required. There are other advantages to it, too: operations on pure functional data structures don't destroy older states of the data structure, meaning that no thread ever sees a shared data structure in an inconsistent state.
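The point about older states surviving can be sketched with a persistent linked list. This is Python rather than Erlang, purely for illustration; `push` and `to_list` are made-up helper names, and immutable tuples stand in for Erlang's list cells.

```python
def push(lst, value):
    """Return a NEW list with value in front; lst itself is left untouched."""
    return (value, lst)


def to_list(lst):
    """Walk the (head, tail) cells and collect the values front to back."""
    out = []
    while lst is not None:
        head, lst = lst
        out.append(head)
    return out


v1 = push(push(None, 1), 2)  # version 1 holds 2, 1
v2 = push(v1, 3)             # version 2 holds 3, 2, 1 and shares v1's cells

if __name__ == "__main__":
    print(to_list(v1))
    print(to_list(v2))
```

Anyone still holding `v1` sees exactly what they saw before `v2` was built, and because tuples cannot be modified in place, there is nothing to lock.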
To tell you the truth, I'd prefer a language where mutable memory is optional (e.g., O'Caml), but I don't think that Erlang's design choice is by any means stupid. I do think the syntax is funny--it was largely copied from Prolog and I don't like Prolog's syntax--but come on, this is also a minor point. The only good point you've brought up is the awful support for Unicode.
Are you adequate?
It's very misleading to start talking about message passing and no-shared-memory as "Erlang-style concurrency". There is a huge history of languages and frameworks in this area that predate Erlang by decades. It does no credit to Erlang when its advocates ignore this.
Ob. dogfood.
Sounds like the beginning of a Nigerian scam mail.
"The two biggest computing providers of today are offering to you opportunities to become financially enriched by building their concurrent offerings on top of really concurrent programming languages and systems. All you have to profit is send $20,000 via Western Union to Ubulu@lagosnet.ng...
This is really a duplicate of previous posts and sales ploy. A hard sell of Erlang.
Me, I never plan on learning or using it. There are plenty of good languages out there without another. But what really turns me off is the hyperbole on threading and multiprocessing. C, C++ and Java thread nicely, thank you. They work across platforms, have standards, are relatively mature and run on big and small systems. C and C++ are also fast. Using an ORB or RPCs isn't that tough.
The limitation today in tapping multiprocessing is the lack of good designs by the people. Nothing else. Those who do not believe this ought to read up on RPCs, semaphores, mutexes and the like. Or just do Java. However, Java does show that not knowing the fundamentals can often result in big, fat, inefficient bloatware.
Real time is also down to design and not the specific language, though it tends towards C. It is why device drivers are written in C. C can be real time and can be made very lean and fast. VxWorks, pSOS, and there is even a Linux RTOS out there: all are in C.
I invested in the 80's in learning C, and shortly after C++. It has no limitations in applied computing. The limitations are purely in the carbon-based unit's design.
It seems when something new is developed everyone thinks that it will solve all the problems the other languages didn't. First it was Ruby on Rails and now we have Erlang. Guess what: it won't.
http://wiki.reia-lang.org/wiki/Main_Page
Apache CouchDB, an Incubator project, uses Erlang. It is a document-oriented database with support for MapReduce views/indexes. Its documents are stored as JSON objects and its MapReduce functions are typically written in JavaScript.
http://pixelcort.com/