Slashdot Mirror


Supercruncher Applications

starheight writes "Bill McColl has written an article contrasting traditional massively parallel supercomputing with a whole new generation of compute-intensive apps that require massively scalable architectures and can deliver both incredible throughput and real-time responsiveness when processing millions or billions of tasks."

58 comments

  1. Yay!!! by nixkuroi · · Score: 5, Funny

    Just in time for Vista!

    1. Re:Yay!!! by vindimy · · Score: 1

      Wait! But does it run Linux???

  2. Example App: by HugePedlar · · Score: 3, Funny

    Dell's website consumer pricing generator.

    --
    Argh.
  3. Wow by Sneakernets · · Score: 5, Funny

    Imagine a Beowulf cluster of-- oh. Wait.

    --
    "No freeman shall ever be debarred the use of arms." -- Thomas Jefferson
  4. The real question.. by Lithdren · · Score: 1

    How many hours does it take vista to boot on this thing?

    1. Re:The real question.. by Anonymous Coward · · Score: 0

      Answer:
      Error; Overflow

    2. Re:The real question.. by nixkuroi · · Score: 1

      Haha...oddly enough, Vista boots faster than my old XP install and aside from some video drivers not yet being released for Vista (What?? I have to run Second Life on my MacBook Pro for the time being?), it's really a pretty decent experience.

      Modders, feel free to mod me off topic :)

  5. Massively parallel?! by Anonymous Coward · · Score: 0, Funny

    GAHH!

    There is no such thing as "massively parallel!" It makes no sense! Parallel is qualitative, NOT quantitative! Things are either parallel or they're not, there are no degrees of "parallelness!"

    The same applies to "massively multiplayer!" It's no wonder that people can't grasp basic logic when they insist on talking like this!

    Shouldn't computer savvy folk notice these sorts of things?!

    1. Re:Massively parallel?! by ReidMaynard · · Score: 1

      With his excessive use of "massively" this is obviously a ginormous supercruncher.

      --
      -- www.globaltics.net

      Political discussion for a new world

    2. Re:Massively parallel?! by Lord+Crc · · Score: 4, Insightful

      There is no such thing as "massively parallel!" It makes no sense! Parallel is qualitative, NOT quantitative! Things are either parallel or they're not, there are no degrees of "parallelness!"

      Sure there are. Say you want to find the maximum of 4 integers. You can do that in parallel, but you won't gain much if you have more than two processors (or execution units). Contrast this with say rendering an image using a path tracer, where each ray is independent of each other. First problem is hard to scale up, second one isn't. I'd say that means that ray tracing is a "more parallel" task.

      Also, writing an algorithm that has to run efficiently on 10000 processors is not exactly the same as writing one that has to run on 4 processors, in the same way that writing a multiplayer game that handles four players isn't the same as writing one that can handle thousands of concurrent players. So they toss on the "massive" part to separate the cases. At least that's my take on it.

    3. Re:Massively parallel?! by Anpheus · · Score: 2, Informative

      Actually, massively parallel has a meaning. For example, the 131,072 CPU beast designed by IBM. This computer is designed to solve problems that have another term attached to them, and that is "embarrassingly parallel" problems. Your average task is not embarrassingly parallel, and thus, is difficult to scale to a massively parallel system. It would take a lot of effort, see?

      But some problems can use massively parallel computers, designed to solve embarrassingly parallel problems.

    4. Re:Massively parallel?! by guaigean · · Score: 2, Interesting

      Additionally, many of these computers don't run just 1 application. IBM's blue gene, and many other Dept. of Energy/Defense/* computers run a large number of research applications, ranging from 10's to 1000's of cores. It is very rare that a single program gets to run on such a large machine for any length of time by itself, so in most cases, programs don't have to scale to 100,000 PE's, but rather they scale to hundreds or a few thousand. Far more applications can scale well to hundreds than thousands, and still have reasonable speedup.

      --
      Microsoft Sucks, F/OSS Rocks. I get mod points now right?
    5. Re:Massively parallel?! by YenTheFirst · · Score: 1

      ...maybe it should be 'degrees of parallel scalability'

      i.e. [these algorithms are] massively parallel scalable.

      buzzwords help too.

      --
      It's not stupid. It's Advanced.
    6. Re:Massively parallel?! by hevenor · · Score: 1

      The top parent's point about the term parallel is correct in the literal sense. Parallel is true or false and there is no spectrum of parallel in the mathematical sense. The term 'concurrent processing' might be more correct (degrees of correctness?) but parallel has slipped into common language.

      Identifying problems that are well suited for a multi-processor platform can be quantified. It's hard to scale up when you define it as 4 integers. Try finding the maximum of n integers.

      aArray[1..n]

      int parallelMax(array, lowIndex, highIndex){

        if(highIndex == lowIndex)
              return array[lowIndex]
        else if(highIndex-lowIndex == 1)
              return max(array[lowIndex], array[highIndex])
        else
              middle = (lowIndex + highIndex) / 2
              open new thread and calculate a=parallelMax(array, lowIndex, middle)
              open new thread and calculate b=parallelMax(array, middle+1, highIndex)
              wait for both threads to finish
              return max(a,b)
        end
      }

      With one processor we need to check every integer and compare it to the current max. This takes at least n comparisons. With n processors we can do half of these at the same time, giving us the same number of comparisons in order log(n) iterations instead of n iterations. So finding a max will see some benefit from having more processors, and that benefit is on the order of n - log(n) iterations saved.

      This benefit will be different for different problems (given the best known algorithm) and then sorting these benefits you could get a 'spectrum of benefit of concurrency' which would denote the 'degrees of parallelism' that the top parent is speaking of.
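
      A minimal Python sketch of the pairwise reduction described above, written round by round instead of recursively. The function name parallel_max and the workers parameter are made up for illustration, and in CPython the thread pool only shows the structure (the interpreter lock prevents a real speedup on pure-Python comparisons). Each round compares adjacent pairs independently, so n values need about log2(n) rounds.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_max(values, workers=4):
    """Pairwise (tree) reduction: every round halves the list of candidates."""
    if not values:
        raise ValueError("parallel_max() needs at least one value")
    level = list(values)
    rounds = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while len(level) > 1:
            # Adjacent pairs; an odd element at the end forms a pair of one.
            pairs = [level[i:i + 2] for i in range(0, len(level), 2)]
            # The pairs are independent, so one round is one parallel step.
            level = list(pool.map(max, pairs))
            rounds += 1
    return level[0], rounds


result, rounds = parallel_max([7, 3, 9, 1, 4, 8, 2, 6])
print(result, rounds)
```

      With 8 inputs the loop runs 3 rounds (8 to 4 to 2 to 1), against 7 sequential comparisons on a single processor.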

    7. Re:Massively parallel?! by Anonymous Coward · · Score: 0

      Can these applications determine the optimal number of cores dynamically by measuring their own performance, or does the O/S just give them whatever's available at the time? (I'm just curious.)

    8. Re:Massively parallel?! by guaigean · · Score: 3, Interesting

      In most cases, researchers request a specific number of cores, based on experience of how well their code scales. Some codes do auto-scale, depending on available cores, but these are rarer. The way it works is in a batch queue system... Users submit a job requiring 2000 cores, and wait until that many are available. Then, when the cores become available, their job runs for 6-48hrs or more, depending on the job. In most cases, a large number of researchers are in contention for computing time, and wait their turn in line. The good ones tend to understand the system better, and will submit workloads that reflect the current available resources, thus limiting the time their work spends sitting in the queue.

      --
      Microsoft Sucks, F/OSS Rocks. I get mod points now right?
    9. Re:Massively parallel?! by The+boojum · · Score: 1

      Not to mention that speedup (i.e., best serial running time over best parallel running time) and efficiency (i.e., speedup over number of processors) are well defined ways of quantifying how well a piece of code scales in parallelism. If, for example, you're still getting 95% efficiency running on 2000 nodes, then I'd call that pretty darn good given Amdahl's Law and "massively scalable". The way the efficiency curve falls off as you increase the number of processors tells you a lot about how parallel a piece of code is.

      Granted massively parallel is a fuzzy, qualitative phrase, but parallel efficiency at high numbers of processors is a pretty good measure.
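
      The two definitions above can be sketched in a few lines of Python. The timings below are invented so that the numbers match the 95%-on-2000-nodes example, and implied_serial_fraction is a hypothetical helper that simply inverts Amdahl's Law to ask how small the serial part must be for that speedup to be possible.

```python
def speedup(serial_time, parallel_time):
    """Best serial running time over best parallel running time."""
    return serial_time / parallel_time


def efficiency(serial_time, parallel_time, processors):
    """Speedup divided by the number of processors used."""
    return speedup(serial_time, parallel_time) / processors


def implied_serial_fraction(speedup_value, processors):
    """Serial fraction f solving speedup = 1 / (f + (1 - f) / p), for p > 1."""
    return (1 / speedup_value - 1 / processors) / (1 - 1 / processors)


# Hypothetical measurements: 3800 s serially, 2 s on 2000 nodes.
t_serial, t_parallel, nodes = 3800.0, 2.0, 2000

s = speedup(t_serial, t_parallel)
e = efficiency(t_serial, t_parallel, nodes)
f = implied_serial_fraction(s, nodes)
print(f"speedup={s:.0f}  efficiency={e:.0%}  implied serial fraction={f:.5%}")
```

      Under these assumed timings the serial fraction comes out below three thousandths of a percent, which gives a feel for how little sequential work a code can afford at that scale.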

  6. Wow, can you imagine a Beowulf cluster of these? by $RANDOMLUSER · · Score: 3, Interesting

    Looking at his examples (Search, Ecommerce, Software-as-a-Service, Infrastructure-as-a-Service, Fraud Detection) I have to think "wow, single point of failure". Lots and lots of fault-tolerance needed to put all your eggs in one basket like that.

    --
    No folly is more costly than the folly of intolerant idealism. - Winston Churchill
  7. Re:Wow, can you imagine a Beowulf cluster of these by Lithdren · · Score: 1

    Good point. A single point of failure not only causes your entire system to go down, but stops the several billion processes you're running all at once. How long would it take to get things running again if something simple stopped? How long if it's a processor that fries out? An hour? A day? Several days? How much money are you losing when that happens?

  8. Prognosticating by truthsearch · · Score: 2, Interesting

    The first half of his list seems a bit flighty. They lean more towards buzz and less useful applications. But the second half is much more practical and likely. There are many potentially interesting applications coming up, but I don't think we'll directly see most of them publicly on the internet. So I give him a +0.5 Insightful.

    1. Re:Prognosticating by Anonymous Coward · · Score: 0

      Just because you don't know what the first half is used for doesn't mean there aren't hundreds of multinationals pouring millions of dollars to use the stuff in order to improve their products and stay competitive.

    2. Re:Prognosticating by kramulous · · Score: 1

      Touché .... The first list may seem to be 'flighty' to some people, but others know that they are the core of the second list.

      --
      .
  9. Right ... know where we are ... for traffic ... by ghstomahawks · · Score: 1

    Yes ... it includes RFID tracking to reduce theft, and ... manage traffic!?!? We need our next generations of supercomputers to follow you around, knowing where you are at all times ... so umm, we can change the traffic lights when the roads get busy for you .... ~Director of NSA Domestic Spying Program

  10. Re:Wow, can you imagine a Beowulf cluster of these by Sneakernets · · Score: 1

    There had better be a CPU dedicated to Error detection and correction!

    --
    "No freeman shall ever be debarred the use of arms." -- Thomas Jefferson
  11. IBM's BlueGene is a massively parallel supercomput by mmell · · Score: 2, Informative
    With hundreds (thousands) of independent cores working on the same problem, it's parallel.

    I've seen the things - trust me, they're massive!

  12. Is it better than... by Anonymous Coward · · Score: 0

    The Big Crunch

  13. Slow news day, huh? by xxxJonBoyxxx · · Score: 3, Insightful

    Slow news day, huh?

    Can we please have a "no links to random, boring blogs week" on Slashdot?

    1. Re:Slow news day, huh? by guaigean · · Score: 1

      If anything is "News for Nerds", it is applications for and scalability of Supercomputers. Just because it doesn't come from NBC or Fox News doesn't mean it is uninteresting, or that it isn't news. A big benefit of the internet is that average people can post news and stories, without the funding of a major news corporation.

      --
      Microsoft Sucks, F/OSS Rocks. I get mod points now right?
    2. Re:Slow news day, huh? by xxxJonBoyxxx · · Score: 1

      If anything is "News for Nerds", it is applications for and scalability of Supercomputers.


      However, this guy's blog might have been taken from a couple of high-school stoners. (Supercomputers for weather? Who would have thunk it?) There's really no insight on this blog; there's nothing that the average geek wouldn't be able to rattle off in five seconds without really thinking.

      average people can post news and stories, without the funding of a major news corporation.


      I know - this blogger is quite "average". However, I don't like wasting my time with "average"; I'm at least an "above-average" kind of guy.

  14. What about how the design scales? by EmbeddedJanitor · · Score: 3, Informative
    The problem with many parallel designs is that they are limited by Amdahl's Law http://en.wikipedia.org/wiki/Amdahl's_Law such that a few CPUs make sense, but large numbers don't (except to the salesman's commission).

    The term "massively parallel" indicates a system operating without those constraints.
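
    As a rough illustration of that limit, here is Amdahl's Law as a small Python function. The 5% serial fraction and the processor counts are arbitrary example values, not figures from the article.

```python
def amdahl_speedup(serial_fraction, processors):
    """Amdahl's Law: speedup = 1 / (F + (1 - F) / N) for serial fraction F."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


F = 0.05  # assumed: 5% of the work cannot be parallelized
for n in (4, 64, 1024):
    print(f"{n:5d} CPUs -> speedup {amdahl_speedup(F, n):.2f}")

# No matter how many CPUs are added, the speedup stays below 1 / F.
print(f"upper bound: {1 / F:.0f}")
```

    With these numbers the jump from 64 to 1024 CPUs buys only a few extra units of speedup, because the bound of 1/F = 20 is already close.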

    --
    Engineering is the art of compromise.
    1. Re:What about how the design scales? by hackstraw · · Score: 2, Interesting


      Amdahl's law isn't really a problem, it's just a thing. The law of gravity is not a problem, it's just a thing.

      Supercomputing is really cool with embarrassingly parallel problems and things like superlinear speedup. Supercomputing is a mess because it's basically a hack. Funding and support are always issues. Even though we buy thousands of CPUs at a time, they are still a blip on the radar compared to regular server sales, and vendors don't cater to supercomputing because, ironically, it's not much of a market for the systems.

      Now, to actually read the article and see what we are talking about :)

    2. Re:What about how the design scales? by Nefarious+Wheel · · Score: 1
      A friend of mine was once head of the massively parallel CS lab at a major Australian university. He observed that it was sad to see massively parallel processes constructed to solve a single problem on a Connection Machine devolve to the point where only a few processors in the corner were doing any work.

      I think he was on to something fundamental about problem organisation, myself -- why else would large, otherwise healthy functioning companies end up with nine coxswains per rower?

      --
      Do not mock my vision of impractical footwear
  15. that paragraph by jjeffries · · Score: 2, Funny
    Did anyone else picture this story's text blurb being read by fast-talker John Moschitta, or is it just me?


    bah weep grana weep minibom

  16. Oblig by true_hacker · · Score: 0

    Imagine a Beowulf cluster of these. And will it run Linux?

  17. The Folding@home SMP client is ready. by Anonymous Coward · · Score: 2, Interesting

    "Using supercomputers to test the next-generation version of the SMP code, we get good scaling to many more cores than in the Intel prototype, and we expect to do even better in the future."

    http://forum.folding-community.org/fpost166684.html#166684
    http://fahwiki.net/index.php/SMP_client

  18. Re: Amdahl's Law by gr8_phk · · Score: 2, Interesting

    My main side project is real time ray tracing software. It is very nearly not subject to Amdahl's Law. In the terminology of the Wiki article, F is approximately zero for Ray Tracing. It will scale very well past 10 cores and may well be able to make good use of 100 cores. Memory bandwidth seems to be the limiting factor (that determines F) but that may not be a problem with enough cache and good code. It's also the only potential mass-market use for a lot of cores. nVidia your days are numbered.

  19. Leeeerrroooooy Jennnnkkinnnnsss!!! by KatchooNJ · · Score: 3, Funny

    "Give me a SUPER number crunch."

    "We have a 32.33, repeating, of course, percent chance of survival."

    "That's better than we usually do."

    --
    "Never give up, for that is just the time and place when the tide will change." -Harriet Beecher Stowe ^_^
  20. Redundancy? by nschubach · · Score: 1

    # Dense linear algebra
    # Sparse linear algebra

    What about Average linear algebra?

    # Structured grids
    # Unstructured grids

    Are there any other types?

    (** Warning: Car analogy...)
    Isn't that kind of like selling a car and listing on the spec sheet:
    # Goes slow
    # Goes fast

    --
    Every time I start to have faith in humanity, I ruin it by driving to work between 7 and 8 am.
    1. Re:Redundancy? by joto · · Score: 2, Informative

      # Dense linear algebra
      # Sparse linear algebra

      What about Average linear algebra?

      For sparse matrices, you can use algorithms that are vastly more efficient than the algorithms you otherwise would use for non-sparse matrices of the same size. This is called sparse linear algebra. If you can't use the algorithms for sparse linear algebra, it doesn't matter whether you call it "dense", "average", "standard", "normal", "regular", "common", or what the fuck else.
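
      To make the difference concrete, here is a small pure-Python sketch comparing a dense matrix-vector product with one over a compressed sparse row (CSR) layout. The function names and the 4x4 example matrix are invented for illustration; real codes would use a tuned library. The point is that the sparse version only touches the stored nonzeros.

```python
def dense_matvec(matrix, x):
    """Dense product: touches every entry, zeros included."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]


def to_csr(matrix):
    """Keep only the nonzeros: values, their columns, and where each row starts."""
    values, col_index, row_start = [], [], [0]
    for row in matrix:
        for j, a in enumerate(row):
            if a != 0:
                values.append(a)
                col_index.append(j)
        row_start.append(len(values))
    return values, col_index, row_start


def csr_matvec(values, col_index, row_start, x):
    """Sparse product: one multiply per stored nonzero."""
    y = []
    for i in range(len(row_start) - 1):
        lo, hi = row_start[i], row_start[i + 1]
        y.append(sum(values[k] * x[col_index[k]] for k in range(lo, hi)))
    return y


A = [[4, 0, 0, 0],
     [0, 0, 2, 0],
     [0, 0, 0, 0],
     [1, 0, 0, 3]]
x = [1, 2, 3, 4]

values, col_index, row_start = to_csr(A)
print(csr_matvec(values, col_index, row_start, x))
print(len(values), "multiplies instead of", len(A) * len(A[0]))
```

      Here the sparse product does 4 multiplications where the dense one does 16, and the gap grows with matrix size when most entries are zero.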

      # Structured grids
      # Unstructured grids

      Are there any other types?

      Again, this refers to the algorithms used. There are different algorithms you use in each case. And sure enough, there are other types, such as grids structured in non-standard ways, that some mathematicians might have developed special algorithms for. However, these are the common types of computations run.

      (** Warning: Car analogy...)
      Isn't that kind of like selling a car and listing on the spec sheet:
      # Goes slow
      # Goes fast

      No. It's like listing the major uses of motorized vehicles, and among them putting "transport of goods", and "transport of people", and then have some dude on the Internet point out that this sounds funny.

  21. Bill McColl was my thesis supervisor at Oxford by Anonymous Coward · · Score: 1, Interesting

    Bill McColl, for those who aren't familiar with him, was the driving force behind the bulk synchronous parallel (BSP) model of programming. This model, while available in the MPI-2 spec, is not widely used as is. Instead, its major contribution is inspiring remote direct memory access and the partitioned global address space, among others.

    Last time we spoke, Bill said that he was interested in the issue of massively scaled computers that can handle fault tolerance pre-emptively. He compared today's supercomputers (Blue Gene, Cray XT4, Altix, etc) to a racing car that was really fast for a few hours a week, but wasn't even reliable enough to get the groceries. He was also interested in computers that can handle a continuous influx of data (as his blog post mentions), similar to managing millions of RSS feeds.

    An example application domain for this stuff would be Wall Street firms that have to run time series analysis on streaming data. Prof. McColl is really on the right track here.

  22. Re:Wow, can you imagine a Beowulf cluster of these by bhmit1 · · Score: 1

    Fault-tolerance is either built into the problem or into the application. Take for example search, if one search server on the backend that is handling 0.1% of the web sites goes down, you may not know or even care that those results are missing (assuming the system doesn't have something built in to give that query to another node searching the same dataset).

    In fraud detection, thinking of the credit card companies, it's typically looking for patterns after the transaction has already gone through, and if one node of the cluster goes down, maybe you give the same transaction list to another node. You never find every case of fraud this way, but you want something that can search as many (or all) of the transactions as quickly as possible to reduce the time between the first instance and shutting down the account.

    For the other examples, you just build it into the system, e.g. one HA broker on the front that can give out a task to another node if the first one goes down. When you build a system like this, single points of failure in the server farm aren't the concern. It's the mean time between failures and the process to replace nodes, the power and cooling requirements, failure points outside of the nodes, etc.
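
    The broker idea in the last paragraph might look roughly like the Python sketch below. Everything here (NodeDown, make_node, dispatch, the node and task names) is hypothetical, and a real broker would also need timeouts, health checks and its own redundancy. It only shows the core move: if one node fails, hand the same task to the next one.

```python
class NodeDown(Exception):
    """Raised by a simulated node that cannot take work."""


def make_node(name, healthy=True):
    """Build a fake worker node; unhealthy nodes always fail."""
    def run(task):
        if not healthy:
            raise NodeDown(name)
        return f"{name} handled {task}"
    return run


def dispatch(task, nodes):
    """Try each node in turn and return the first successful result."""
    failures = []
    for node in nodes:
        try:
            return node(task), failures
        except NodeDown as err:
            failures.append(str(err))  # remember which node failed, move on
    raise RuntimeError(f"no node could run {task!r}; failed: {failures}")


nodes = [make_node("node-a", healthy=False), make_node("node-b")]
result, failed = dispatch("txn-batch-17", nodes)
print(result)
print("skipped:", failed)
```

    The caller sees a normal result even though node-a was down, which is the sense in which a single node failure stops being a single point of failure.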

  23. Re: Amdahl's Law by joto · · Score: 1

    It's also the only potential mass-market use for a lot of cores.

    Either that, or your imagination is lacking somewhat. Personally, I've wanted lots of cores since I was in kindergarten. I'm quite sure I can find a use for them all.

  24. Re: Amdahl's Law by drinkypoo · · Score: 2, Interesting

    It's also the only potential mass-market use for a lot of cores.

    What? You are on drugs, yes? And not the good kind?

    What about video encoding? Besides codec parallelism, you can also parallelize the distance between two keyframes, handing that chunk off to a core (or node) for processing. This is very mass-market - more and more people want to make snazzy home movies.

    In fact, far more people would like to do this than render 3d movies.
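
    A toy Python sketch of the keyframe-chunk idea: frames are modeled as the letters "I" (keyframe) and "P" (dependent frame), and encode_chunk is a placeholder, not a real codec. A thread pool is used here only to keep the example short; actual encoders would use processes or separate machines.

```python
from concurrent.futures import ThreadPoolExecutor


def split_at_keyframes(frames):
    """Group frames into chunks that each begin at a keyframe ('I')."""
    chunks = []
    for frame in frames:
        if frame == "I" or not chunks:
            chunks.append([])
        chunks[-1].append(frame)
    return chunks


def encode_chunk(chunk):
    """Placeholder for real encoding work: just tags each frame."""
    return [f"enc({frame})" for frame in chunk]


def encode_parallel(frames, workers=4):
    """Encode every keyframe-delimited chunk independently, then rejoin in order."""
    chunks = split_at_keyframes(frames)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        encoded = pool.map(encode_chunk, chunks)  # map() keeps chunk order
        return [frame for chunk in encoded for frame in chunk]


frames = list("IPPPIPPIPPPP")
print([len(c) for c in split_at_keyframes(frames)])
print(encode_parallel(frames)[:5])
```

    Because no chunk depends on frames outside itself, the chunks can be processed in any order and stitched back together afterwards.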

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  25. Why do I want multiple cores? by tbuskey · · Score: 1

    As a standard power user running internet apps/office apps/video processing (home/tv)

    At one point you have the app running on a core, the OS on one, the graphics on the GPU, the network on a cpu. You get lower latency because your app's cpu doesn't have to time slice with the others.

    I can see parallel makes, conversion (wav2mp3, video formats etc), formatting (commercial skipping, panorama stitching). I/O is going to be the ultimate bottleneck.

    What kind of consumer applications would benefit?

    1. Re:Why do I want multiple cores? by David+Greene · · Score: 2, Interesting
      • Anything dealing with graphs (searching, for example)
      • Many things dealing with large data arrays (video, for example, as you pointed out)
      • Anything that can be pipelined (software radio, for example)
      • Lots of physics modeling (games, for example)
      • A bunch of stuff we've not even thought of yet

      Some off-the-wall future consumer things to consider:

      • Homebrew processor (or any) design (design space searching could be really interesting)
      • Dynamic compilation/jit/adaptive software/introspective computing
      • Immersive gaming (CAVE in the home, advanced AI, physics & video, etc.)

      I think things get particularly interesting thinking about many-core in handheld devices. Software radio could be tremendously useful there. Route planning using a network of GPS-enabled handheld devices would be cool. Background searching could be used for a lot of things.

    2. Re:Why do I want multiple cores? by Anonymous Coward · · Score: 0

      I/O is going to be the ultimate bottleneck
      As always, but I/O scheduling may improve.

  26. Errors by Beryllium+Sphere(tm) · · Score: 2, Insightful

    ...web players such as Google, Microsoft, Yahoo, Amazon and eBay. For these systems, scale, resilience and real-time throughput are major concerns. In contrast to the other two classes, these systems need to process vast numbers of simultaneous tasks, to deliver guaranteed real-time performance...

    None of those offer or require real-time guarantees.

    Unlike today's search engines and data mining systems, which essentially search the past, continuous search is about searching the present and the future.

    Google Alerts is here now.

    A better article would have started with the table that defines "supercruncher" and proceeded to describe the architectural issues of building one. Ideally it would have addressed the software challenges.
  27. Re:Wow, can you imagine a Beowulf cluster of these by Anonymous Coward · · Score: 0

    Ecommerce has single points of failure? Where is that? In the replicated database? No, it must be in the replicated webservers.

    Or maybe not. Maybe instead you don't know what you're talking about. Whodathunkit?

  28. Re:Wow, can you imagine a Beowulf cluster of these by Frumious+Wombat · · Score: 1

    It's rare for an entire machine like that to fail. More likely is 1 processor board, or similar subsystem, which you can design for (I didn't get a result back, try again) in software, and, like the T3E which shipped with redundant processors, in hardware as well. If you have enough processors, you could stripe your job across several, so if one doesn't return a result, a second one will. Now, locating your only one of these machines in California might not be the best idea (we had an earthquake which started a eucalyptus grove fire, but don't worry, the mudslide put it out), but it's unlikely that you'll lose an entire one.

    Just to geek out for a moment, picture a system large enough to finally troll through all of that data NASA brought back from the Mariner missions, and cross-reference it against what they get daily now from the various Mars probes. Finally turn all of that data into information, as the blog says.

    --
    the more accurate the calculations became, the more the concepts tended to vanish into thin air. R. S. Mulliken
  29. Re: Amdahl's Law by David+Greene · · Score: 2, Informative

    100 cores is not massively parallel. The kind of scaling we're talking about is much higher. Think thousands of cores each with hundreds of threads.

    This is the kind of scaling that weather centers are just starting to reach today. It's the kind of scaling that will require a radical rethinking of how consumer software is designed and what tools we need to make that design process easier.

    In this world, software is king. You won't care who your chip vendor is. You'll care who provides your compiler, debugger, performance analysis tools and other such things.

  30. The Weightless Economy in disguise by heroine · · Score: 2, Informative

    Fascinating that a story purporting to be about supercomputers is actually a summary of Weightless Economy theory. The theory is that the wealthiest countries can't achieve more wealth by implementing things anymore. They can't increase their net worth by manufacturing or solving math problems. They have to turn instead to philosophical goals like people management, interpreting literature, creating works of art.

    The supercomputer function is still the same. It still solves algebra, n-body methods, structured grids, and finite state machines. The user of the supercomputer is different. The user is now living on $1 a day in Mongolia.

    For the wealthiest countries to stay wealthy, they have to focus on not the computing part but marketing the computing, creating the interface to the math, managing the business around the computing.

    1. Re:The Weightless Economy in disguise by mov_eax_eax · · Score: 1

      I disagree. Why is solving climate models not a work of art? Why would an astronomical simulation or a chess machine be disconnected from philosophy? The computing part is an implementation matter, yes, but it carries its share of philosophy and art, at least in the eyes of the implementors.

  31. Re:Wow, can you imagine a Beowulf cluster of these by Anonymous Coward · · Score: 0

    really? you pulled the T3E out of your ass? Well done, sir. So yea, this is why there's an emerging market and lots of research going into things like predictive failure models and checkpointing where you can have a backend engine throw up a flag whenever it detects the conditions of a likely failure, check point and move. This search stuff. You can see vendors looking at running non-mpi-ified code on machines that embrace the MPP model, this opens the door for running massive installations like Google on a BlueGene and a filesystem cluster (I know I just said that and it's not so feasible with BlueGene/L's model of partitioning, but I think that's a minor limitation).

  32. "Real-time" only in the casual non-technical sense by Doug+Jensen · · Score: 1

    As used in the field of "real-time computing/systems," satisfying time constraints is a correctness criterion, not simply a performance metric.

    --
    Doug Jensen
  33. of course they do by Anonymous Coward · · Score: 0

    It means more than just a few nodes, Mr. pedantic dipsquat. Apparently, everyone on the planet but you knows what it means. Parallel processing done in a massive manner with highly tweaked software to take advantage of the processors and i/o, as opposed to two old pentium ones with a crossover cable and 5 lines of JS. And then run through market speak to make it sound cool. And that's it. Sure, you can make it sound stupid like a standup comic pointing out the old "pair of pants" routine, but that's all it is, too, a word joke. Get over it, george carlin and steve martin and robin williams and various other professional funny guy fast talkers and language noticers do it better than you, PLUS get paid well for it, PLUS it's fun to listen to, at least the first time. "jumbo shrimp" "driveway or parkway?" and etc.

  34. Auto scaling by marcus · · Score: 1

    Some time ago while doing some research into 'massively parallel' applications for a bio-research company I wrote an auto scaling hack on top of the Pov Ray PVM port. It worked fairly well at monitoring cpu loads across a network, dicing up the scenes to be rendered and shipping off chunks of work to various CPUs as they were available.

    Overall the research project covered scaling from the CPU/core through cache to DRAM to disk to network even up to the point of when you'd have to actually scale the dispatcher in order to keep all of the processors busy. It was interesting stuff and produced some nice graphs of performance curves clearly indicating what was the bottleneck for each type of computing problem that we evaluated.
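
    For readers curious what "shipping off chunks of work to CPUs as they were available" can look like, here is a small Python sketch using a shared work queue. It is not the PVM-based hack described above: the tile sizes, render_tile and render_dynamic are invented, threads stand in for networked CPUs, and there is no load monitoring. Idle workers simply pull the next tile, so faster workers naturally end up doing more.

```python
import queue
import threading


def make_tiles(width, height, size):
    """Dice the image into (x0, y0, x1, y1) tiles of at most size x size."""
    return [(x, y, min(x + size, width), min(y + size, height))
            for y in range(0, height, size)
            for x in range(0, width, size)]


def render_tile(tile):
    """Stand-in for rendering: returns the number of pixels in the tile."""
    x0, y0, x1, y1 = tile
    return (x1 - x0) * (y1 - y0)


def render_dynamic(tiles, workers=4):
    """Workers pull tiles from a shared queue until none are left."""
    work = queue.Queue()
    for tile in tiles:
        work.put(tile)
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                tile = work.get_nowait()
            except queue.Empty:
                return  # nothing left to do
            value = render_tile(tile)
            with lock:
                results[tile] = value

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


tiles = make_tiles(100, 60, 32)
results = render_dynamic(tiles)
print(len(tiles), "tiles,", sum(results.values()), "pixels rendered")
```

    A single dispatcher like this queue eventually becomes the bottleneck itself, which is the point at which the dispatcher has to be scaled too.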

    Damn it was nice working for that company.

    --
    Good judgement comes from experience, and experience comes from bad judgement.
    - W. Wriston, former Citibank CEO