Forget Moore's Law?
Roland Piquepaille writes "On a day where CNET News.com releases a story named "Moore's Law to roll on for another decade," it's refreshing to look at another view. Michael S. Malone says we should forget Moore's law, not because it isn't true, but mainly because it has become dangerous. "An extraordinary announcement was made a couple of months ago, one that may mark a turning point in the high-tech story. It was a statement by Eric Schmidt, CEO of Google. His words were both simple and devastating: when asked how the 64-bit Itanium, the new megaprocessor from Intel and Hewlett-Packard, would affect Google, Mr. Schmidt replied that it wouldn't. Google had no intention of buying the superchip. Rather, he said, the company intends to build its future servers with smaller, cheaper processors." Check this column for other statements by Marc Andreessen or Gordon Moore himself. If you have time, read the long Red Herring article for other interesting thoughts."
BBC Article on the same story here.
Google supports thousands of user request sessions, not one huge straight-line serial command sequence. This means that a huge bunch of smaller servers will do the job quicker than one big super-server. That is not only because of the raw computing power, but also because of the parallelism extracted by doing so and the overhead avoided by not running too many tasks on one server.
Google doesn't really do much in terms of actual hardcore processing - it just takes in a LOT of requests - but each one isn't intense, and it is short-lived.
On the other hand, say you are running a renderfarm - in that case you want a fast distributed network, the same way google does, but you also want each individual node as fast as freakin possible.
They have been using Alphas for a long time for that exact reason - so now with the advent of the Intel/AMD 64s, that will drive prices down on all of it - so I would imagine the render farms are quite happy about that. That means that they can either stay at the speed at which they do things now, but for cheaper - or they can spend what they do now and get much more done in the same time... either way leading to faster production and arguably more profit.
The clusters that I am most familiar with are somewhere in between - they don't need the newest fastest thing, but they certainly wouldn't be hurt by a faster processor.
For the stuff I do though, it doesn't matter too much - if I have 20 hours or so to process something, and I have the choice of doing it in 4 minutes or 1 minute, I will take whichever is cheaper since the end result might as well be the same otherwise in my eyes.
There are some odd things afoot now, in the Villa Straylight.
The NoW (Network of Workstations) approach has been an ongoing trend over the last few years, as the throughput achieved by N distinct processors connected by a high-speed network is nearly as good as (and sometimes better than) that of an N-processor mainframe. All this comes at a cost that is much less than that of a mainframe. In Google's case, it is the volume that is the problem, and not necessarily the complexity of the tasks presented. Thus, Google (and many other companies) can string together a whole bunch of individual servers (each with its own memory and disk space, so there is no memory contention - another advantage over the mainframe approach) relatively cheaply and get the job done by load balancing across the available servers. Replacement and upgrades - yes, eventually to the 64-bit chips - can be done iteratively so as to not impact service, etc. Lots of advantages...
Here is a link to a seminal paper on the issue if you are interested:
http://citeseer.nj.nec.com/anderson94case.html
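The load-balancing idea in the post above can be sketched in a few lines of Python. This is only an illustration, not how Google actually does it: the `Server` class, the `dispatch` function, and the node and query names are all made up for the example, and round-robin is just the simplest balancing policy to show.

```python
from itertools import cycle


class Server:
    """Hypothetical stand-in for one cheap node with its own private state."""

    def __init__(self, name):
        self.name = name
        self.handled = 0  # per-node counter: nothing is shared between nodes

    def handle(self, request):
        self.handled += 1
        return f"{self.name} served {request}"


def dispatch(requests, servers):
    """Hand each request to the next server in round-robin order."""
    rotation = cycle(servers)
    return [next(rotation).handle(req) for req in requests]


# Twelve short, independent requests spread over four nodes.
servers = [Server(f"node{i}") for i in range(4)]
replies = dispatch([f"query{i}" for i in range(12)], servers)
load = {s.name: s.handled for s in servers}
```

Because no request depends on another and no node shares memory with another, a node could be taken out of the `servers` list and swapped for a newer one without the others noticing, which is the iterative-upgrade point made above.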
From the article:
He gave the Monday keynote at the "Hot Chips" conference at Stanford last August.
There is an abstract of his keynote.
No electrons were harmed creating this post, though some may have been subjected to electrical and/or magnetic fields.
A one-off matrix inversion. Well, parts of it can't be done efficiently in parallel.
Though the resulting matrix would probably be applied across a lot of data, and that can be done in parallel.
A matrix inversion can be done very fast if you have a Very MPP system (say effectively 2^32 processors!) like a quantum computer.
thank God the internet isn't a human right.
Matrix inversion comes to mind -- it is very difficult to parallelize.
I found a nice little read about how to decide if any particular problem you are looking at is easily parallelizable.
It is in pdf (looks like a power point presentation).
http://cs.oregonstate.edu/~pancake/presentation
It's worth noting that the Earth Simulator is actually a cluster of vector mainframes (NEC SX-6s) using a custom interconnect. You could do something similar with the Cray X-1 if you had US$400M or so to spend.
If you're referring to the article I think you are, it was specifically talking in the context of weather simulation -- an application area where vector systems are known to excel (hence why the Earth Simulator does so well at it). The problem is that vector systems aren't always as cost-effective as clusters for a highly heterogeneous workload. With vector systems, a good deal of the cost is in the memory subsystem (often capable of several tens of GB/s in memory bandwidth), but not every application needs heavy-duty memory bandwidth. Where I work, we've got benchmarks that show a cluster of Itanium-2 systems wiping the walls with a vector machine for some applications (specifically structural dynamics and some types of quantum chemistry calculations), and others where a bunch of cheap AMDs beat everything in sight (on some bioinformatics stuff). It all depends on what your workload is.
"My life's work has been to prompt others... and be forgotten." --Cyrano de Bergerac
Computing the MD5 sum of 1TB of data. :-) MD5 depends on (among other things) being non-parallelizable for its security.
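The sequential nature of the MD5 computation (leaving the security argument aside) can be seen with Python's standard `hashlib`. The sketch below is only an illustration: the 1 MB buffer stands in for the 1TB input, and the 64 KB chunk size is an arbitrary choice. Feeding chunks in order reproduces the normal digest, because each `update` continues from the state the previous one left behind; hashing the chunks independently, as a naive parallel split would, gives a different value.

```python
import hashlib

data = b"x" * 1_000_000  # small stand-in for the 1TB input
chunk = 64 * 1024        # arbitrary chunk size for the illustration

# Streaming: chunks must be fed in order, each update building on the last.
streamed = hashlib.md5()
for i in range(0, len(data), chunk):
    streamed.update(data[i:i + chunk])

# Hashing everything in one call gives the reference digest.
one_shot = hashlib.md5(data).hexdigest()

# Naive parallel split: hash every chunk on its own, then hash the digests.
# This is a different function of the data, not the MD5 of the data.
independent = hashlib.md5(
    b"".join(
        hashlib.md5(data[i:i + chunk]).digest()
        for i in range(0, len(data), chunk)
    )
).hexdigest()
```

Here `streamed.hexdigest()` equals `one_shot`, while `independent` does not, so splitting the work across nodes changes the answer rather than speeding up the same computation.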
the standard search algorithm for chess play is something called minimax search with alpha-beta pruning.
This algorithm is something I'm familiar with. (Not chess, but other toy games in LISP, like Tic Tac Toe, Checkers, and Reversi, all of which I've implemented using a generic minimax-alphabeta subroutine I wrote.) (All just for fun, of course.)
If you have a bunch of parallel nodes, you throw all of the leaf nodes at them in parallel. As soon as leaf board scores start coming in, you min or max them up the tree. You may be able to alpha-beta prune off entire subtrees. Yes, at higher levels, the process is still sequential. But many boards' scores at the leaf nodes need to be computed, and that could be done in parallel. Yes, you may alpha-beta prune off a subtree that has already had some of your processors thrown at its leaf nodes -- you abort those computations and re-assign those processors to the leaf nodes that come after the subtree that just got pruned off.
Am I missing anything important here? It seems like you could still significantly benefit from massive parallel processing. If you have enough processors, the alpha-beta pruning itself might not even be necessary. After all, the alpha-beta pruning is just an optimization so that sequential processing doesn't have to examine subtrees that wouldn't end up affecting the outcome. But let's say each board can have 10 possible moves made by each player, and I want to look 4 moves ahead. This is 10^4 = 10,000 leaf boards to score. If I have more than 10,000 processors, why even bother to alpha-beta prune? Now, if I end up needing to examine 1 million boards (more realistic perhaps) and I can do them 10,000 at a time, I still may end up being able to take advantage of some alpha-beta pruning. And examining boards in sequential batches of 10,000 is still faster than examining them 1 at a time.
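The 10-moves, 4-ply example above can be sketched in Python. This is a toy under stated assumptions, not a chess engine: `score_leaf` is a made-up deterministic evaluation, a board is represented only by its path of moves, and a thread pool stands in for the 10,000 processors (CPython threads will not actually speed up this CPU-bound work). It scores every leaf up front, folds the scores up the tree with min/max, and checks the result against an ordinary sequential alpha-beta search.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

BRANCHING = 10  # moves available at every position (figure from the post)
DEPTH = 4       # moves looked ahead (figure from the post)


def score_leaf(path):
    """Hypothetical board evaluation: a deterministic function of the moves."""
    h = 0
    for move in path:
        h = (h * 31 + move * 17 + 7) % 1009
    return h % 201 - 100


def minimax_from_scores(scores, depth, maximizing):
    """Fold a flat list of leaf scores (in path order) up the tree."""
    if depth == 0:
        return scores[0]
    step = len(scores) // BRANCHING  # leaves under each child subtree
    children = [
        minimax_from_scores(scores[i * step:(i + 1) * step],
                            depth - 1, not maximizing)
        for i in range(BRANCHING)
    ]
    return max(children) if maximizing else min(children)


def alphabeta(path, depth, alpha, beta, maximizing, counter):
    """Plain sequential alpha-beta; counter[0] counts leaves actually scored."""
    if depth == 0:
        counter[0] += 1
        return score_leaf(path)
    best = float("-inf") if maximizing else float("inf")
    for move in range(BRANCHING):
        value = alphabeta(path + (move,), depth - 1,
                          alpha, beta, not maximizing, counter)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:
            break  # remaining siblings cannot change the outcome
    return best


# Brute force: score all leaves "in parallel", then min/max them up the tree.
leaves = list(product(range(BRANCHING), repeat=DEPTH))
with ThreadPoolExecutor(max_workers=8) as pool:
    scores = list(pool.map(score_leaf, leaves))
brute_value = minimax_from_scores(scores, DEPTH, True)

# Sequential search with pruning, for comparison.
visited = [0]
pruned_value = alphabeta((), DEPTH, float("-inf"), float("inf"), True, visited)
```

Both approaches return the same root value; `len(leaves)` is 10,000, and `visited[0]` shows how many of those leaves the sequential search had to score after pruning. The sketch does not attempt the harder part described above, aborting work already in flight on a subtree that gets pruned.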
Vector processors wouldn't be any more helpful here (would they?) than massively parallel?
Of course, whether a mere 10,000 processors constitutes massively parallel or not is a matter of interpretation. Some people say a 4-way SMP is massively parallel. I suppose it depends on your definition of "massively".
The price of freedom is eternal litigation.