Slashdot Mirror


There's A Cluster of 750 Raspberry Pi's at Los Alamos National Lab (insidehpc.com)

Slashdot reader overheardinpdx shares a video from the SC17 supercomputing conference where Bruce Tulloch from BitScope "describes a low-cost Raspberry Pi cluster that Los Alamos National Lab is using to simulate large-scale supercomputers." Slashdot reader mspohr describes them as "five rack-mount Bitscope Cluster Modules, each with 150 Raspberry Pi boards with integrated network switches." With each of the 750 chips packing four cores, it offers a 3,000-core highly parallelizable platform that emulates an ARM-based supercomputer, allowing researchers to test development code without requiring a power-hungry machine at significant cost to the taxpayer. The full 750-node cluster, running 2-3 W per processor, runs at 1000W idle, 3000W at typical and 4000W at peak (with the switches) and is substantially cheaper, if also computationally a lot slower. After development using the Pi clusters, frameworks can then be ported to the larger scale supercomputers available at Los Alamos National Lab, such as Trinity and Crossroads.
BitScope's Tulloch points out the cluster is fully integrated with the network switching infrastructure at Los Alamos National Lab, and applauds the Raspberry Bi cluster as "affordable, scalable, highly parallel testbed for high-performance-computing system-software developers."
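
As a rough check on the figures in the summary, the sketch below divides the quoted cluster-wide power draw across the 750 nodes. The variable and function names are made up for illustration, and the per-node numbers include each node's share of the switches, so they are not the same thing as the "2-3 W per processor" figure.

```python
# Figures taken from the summary above; all names are illustrative.
NODES = 750
CORES_PER_NODE = 4
CLUSTER_WATTS = {"idle": 1000, "typical": 3000, "peak": 4000}  # includes switches


def watts_per_node(total_watts, nodes=NODES):
    """Average share of cluster power per node, switches included."""
    return total_watts / nodes


total_cores = NODES * CORES_PER_NODE
per_node = {state: watts_per_node(w) for state, w in CLUSTER_WATTS.items()}

print(f"total cores: {total_cores}")
for state, watts in per_node.items():
    print(f"{state}: {watts:.2f} W per node")
```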

128 comments

  1. It's an older meme, sir, but it checks out... by Entrope · · Score: 5, Funny

    Did they make a Beowulf cluster of those?

    1. Re:It's an older meme, sir, but it checks out... by Anonymous Coward · · Score: 0

      nah, they're just debugging the next iOS.

    2. Re:It's an older meme, sir, but it checks out... by Anonymous Coward · · Score: 1

      I christen thee a Finnesburg cluster.

    3. Re:It's an older meme, sir, but it checks out... by jrmcferren · · Score: 1

      Now we just need a cluster of the clusters in the name of Science.

      --
      sudo mod me up
    4. Re:It's an older meme, sir, but it checks out... by Anonymous Coward · · Score: 0

      Now we just need a cluster of the clusters in the name of Science.

      They are called grids and have been available for 10+ years. For example, the "Worldwide LHC Computing Grid".

  2. Re: Obvious unimportant topic by Anonymous Coward · · Score: 1

    Fuck Beta!

  3. No wonder pi's are hard to buy by sjwest · · Score: 1

    When somebody buys 750 all at once.

    It was my experience that pi's are hard to buy so i gave up trying to get one. Mind you when people use ancient rasbian os and make 'secure' email servers on port 26 and then get called out for issues it is good to see that somebody is using them properly instead of poorly.

    1. Re:No wonder pi's are hard to buy by hcs_$reboot · · Score: 1

      The PI is sold everywhere, at the site + amazon etc... got mine quickly. Sure if you try to get the latest thing the next day, that could be challenging.

      --
      Slashdot, fix the reply notifications... You won't get away with it...
    2. Re:No wonder pi's are hard to buy by serviscope_minor · · Score: 1

      It was my experience that pi's are hard to buy so i gave up trying to get one.

      Maybe once upon a time...?

      You can get them from many of the major vendors these days. I usually get mine from RS but I suspect the likes of Farnell and so on sell them too. I expect Amazon sells them too!

      --
      SJW n. One who posts facts.
    3. Re:No wonder pi's are hard to buy by Zaiff+Urgulbunger · · Score: 1

      Mind you when people use ancient rasbian os and make 'secure' email servers on port 26 and then get called out for issues it is good to see that somebody is using them properly instead of poorly.

      You know Raspbian is basically Debian? So it's pretty solid for the most part.

    4. Re:No wonder pi's are hard to buy by Anonymous Coward · · Score: 2, Informative

      Element 14 has 80,000 in stock and ready to ship.

      750 would be a small order.

    5. Re:No wonder pi's are hard to buy by ShanghaiBill · · Score: 1

      The PI is sold everywhere, at the site + amazon etc...

      The RPi Zero has been sold out continuously everywhere for months.

    6. Re: No wonder pi's are hard to buy by Anonymous Coward · · Score: 0

      They've been regularly in stock at my local Micro Center for months. I buy one almost every trip.

    7. Re: No wonder pi's are hard to buy by ShanghaiBill · · Score: 1

      They've been regularly in stock at my local Micro Center for months.

      Micro Center's website listed them as "out of stock" for at least the last 4 months. They are currently not listed at all. So apparently they no longer carry them, or at least are no longer taking new orders.

      I buy one almost every trip.

      A Zero? Without buying a $30 "development kit" that includes a $5 Pi? I don't think so.

    8. Re:No wonder pi's are hard to buy by Anonymous Coward · · Score: 0

      Yeah usually by greedy retailers that package up the few remaining ones into $100 "dev kits" which include all kinds of extra stupid junk that anyone playing with electronics or previous PIs for any decent time doesn't need anymore. Try actually finding Pis on their own, going at the suggested retail prices. I usually just stay one generation behind on the PIs. I only got my 1st Pi2 earlier this year. Whenever the Pi4 comes out, you should be able to find the Pi3 at reasonable prices.

    9. Re:No wonder pi's are hard to buy by Anonymous Coward · · Score: 0

      This seems like a huge waste of time, money and space, but hey, anyone can make a cluster out of weak hardware.

      The Raspberry Pi is a super weak system. It's about 1/4 to 1/2 the speed of an iPhone, it's even weaker than the CPU in the NES Mini classic/SNES Mini classic and most of its appeal is the price ($5-$20) to do IoT and robotics projects that don't require the cpu power or energy input.

      So by making a cluster out of them, probably wastes a huge amount of energy unless they came up with their own power supply system, because plugging in 750 AC adapters is absolutely asinine.

    10. Re: No wonder pi's are hard to buy by Anonymous Coward · · Score: 0

      Ya, those idiots at LANL probably don't know the first thing about building clusters and power efficiency...

    11. Re: No wonder pi's are hard to buy by Anonymous Coward · · Score: 0

      $35 on Amazon, same day delivery for me.

      I actually just bought three for a project.

    12. Re: No wonder pi's are hard to buy by Anonymous Coward · · Score: 0

      They've been regularly in stock at my local Micro Center for months.

      Micro Center's website listed them as "out of stock" for at least the last 4 months. They are currently not listed at all. So apparently they no longer carry them, or at least are no longer taking new orders.

      I buy one almost every trip.

      A Zero? Without buying a $30 "development kit" that includes a $5 Pi? I don't think so.

      Micro Center in Chicago has them. They are limited to 1 per customer, but they had no problem selling me a Zero and a Zero W in a single transaction last night. For some reason, the Zero W is currently US$4.99 while the Zero is US$5.00.

    13. Re:No wonder pi's are hard to buy by piojo · · Score: 1

      Not a programmer, are you?

      --
      A cat can't teach a dog to bark.
    14. Re: No wonder pi's are hard to buy by Anonymous Coward · · Score: 0

      Well they are in stock in the country that they come from (the UK). But here most of the shops have 1 per customer policy (actually one per shipment), this still works for the shops because they make more money through selling all the hats and hobbyist kit to actually do interesting projects with the PI, maybe the problem is they don't have that in America and allow greedy people to steal them all.

  4. It's be cool to build one at home by Anonymous Coward · · Score: 0

    This would be an amazing side project for someone to do at home. And it wouldn't break the bank (too badly... maybe the cost of a new car). What a talking point that would be at a job interview.

  5. Raspberry Bi? by HalAtWork · · Score: 1

    As in bidirectional communication I assume!

  6. Raspberry Bi? by Anonymous Coward · · Score: 0

    Is that similar to a Raspberry Trans?

  7. Re: Cost by Anonymous Coward · · Score: 5, Insightful

    You are missing the point! The idea is not to have a supercomputer but to emulate one. Writing code for stuff like this is hard and running it on the real deal is expensive. This way they can emulate a 750 core system at a fraction of the cost.

  8. Re:Cost by JaredOfEuropa · · Score: 1

    Apparently the point is to simulate a powerful machine with many cores, so that people can develop and optimize their code without requiring CPU time on the actual (very expensive) machine.

    --
    If construction was anything like programming, an incorrectly fitted lock would bring down the entire building...
  9. Re: Cost by Anonymous Coward · · Score: 0

    I have 50 iPhones. Cost me under $1k

  10. Excellent way to learn parallel programming. by Anonymous Coward · · Score: 1

    If you cannot get 1000's of slow cpu's to scale, then wasting debug time on the big fast server is really a waste. Today's programmers need to learn how it used to be. Even with using RPI's they have an advantage. The network is much faster than what we had 20 or 30 years ago. Internal busses are faster, ram/memory is faster, caches are faster. This is a smart way to spend money for a bringup development environment on the cheap.

    1. Re: Excellent way to learn parallel programming. by Anonymous Coward · · Score: 0

      Learning, yes. Performance tuning, no. Modern supercomputers' communication performance is vastly different. Even if you get something running on this cluster, you are basically wasting your time, since the issue you need to fix to make it fast may not be visible. It's cute. You can learn parallel computing on your multicore laptop, but that doesn't mean you can get code to run fast on an infiniband based distributed cluster with 10,000 cores. Once researchers get something running, they stop optimizing even if it is only a few times faster, no matter how much better it could be. This cluster just encourages that behavior.

  11. Re: Cost by Anonymous Coward · · Score: 5, Informative

    You get effect of network latency to induce concurrency paradoxes that wouldn't happen on a shared memory system.

    ObCarAnalogy: a single bus can move a lot of people, but if you're modeling highway traffic, you want to use many independent cars.

  12. Re:Obvious unimportant topic by Zero__Kelvin · · Score: 5, Insightful

    Not really, but it does show that there are a lot more idiots like you coming here, and a lot fewer of the people who belong here. My very first thought was "Holy shit! Something that actually belongs on Slashdot on Slashdot!" If your thought was "meh" then I have no idea why you even come here.

    --
    Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
  13. Re: Cost by mean+pun · · Score: 5, Insightful

    You are missing the point! This way they can emulate a 750 core system at a fraction of the cost.

    So, what point am I missing? The Xeon phi 7290 is 4k$ and has 72 cores, you can get 10 of those and get way more speed, shared memory benefit etc...

    Entirely different architecture. The point of this scale model is to have a cluster of compute nodes with TCP/IP communication between them.

  14. Re: Cost by Anonymous Coward · · Score: 0

    You get effect of network latency to induce concurrency paradoxes that wouldn't happen on a shared memory system.

    You can run MPI on a shared memory system. It will have no problem uncovering any concurrency race conditions.

  15. Re: Cost by Anonymous Coward · · Score: 0

    Entirely different architecture. The point of this scale model is to have a cluster of compute nodes with TCP/IP communication between them.

    Which is completely useless since the real machine will have a completely different interconnect.

  16. Are they running 64-bit os? by Anonymous Coward · · Score: 0

    Are they running 64-bit os? If so, they can tap into significant performance from the arm-64/NEON/SIMD/crypto instructions, etc.

  17. Re: Cost by serviscope_minor · · Score: 5, Insightful

    So, what point am I missing? The Xeon phi 7290 is 4k$ and has 72 cores, you can get 10 of those and get way more speed, shared memory benefit etc...

    The shared memory is a detractor not a benefit if you're trying to have something which emulates an expensive distributed architecture. The point isn't to get lots of speed, it's to get a bunch of cores distributed over a local network in order to get a cheap test bed emulation of a much larger machine.

    --
    SJW n. One who posts facts.
  18. Attack of the Errant Apostrophe by tsqr · · Score: 1

    You don't need a supercomputer to figure out that the headline is poor usage. The Chicago Manual of Style will do that for you.

  19. You mean 75% fewer cores, fewer connects, no RAM by raymorris · · Score: 5, Informative

    10 CPUs with 72 cores each is 720 cores.
    750 SOCs with 4 cores each is 3,000 cores (and RAM and motherboards included).

    The point is to have a massive number of cores in a large number of machines, to simulate a large number of machines, at the budget point. Your idea would have 75% fewer cores.

    > shared memory

    Yep, that's another problem with your idea. It would no longer be an accurate simulation. Well except your plan doesn't include any RAM at all. Or motherboards, networking, etc. You're going to need to buy 750 network cards to simulate 750 machines, motherboards each capable of holding 18 cards, a number of storage devices, etc. So maybe FIVE 7290 CPUs with exotic motherboards plus RAM, network cards, storage, etc. Five 7290s would provide 360 cores, vs the 3,000 cores they got with the Pis.

    Now AFTER the research yields fruit, in a couple years someone might want to put the ideas into production using fifty 72-core processors which may cost $2,000 each.
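
    The core counts argued over in this comment can be checked with a few lines. This is only a sketch of the arithmetic quoted in the thread; the constant and function names are invented for illustration.

```python
# Core counts quoted in the thread; names are illustrative only.
XEON_PHI_7290_CORES = 72
PI_SOC_CORES = 4


def total_cores(units, cores_per_unit):
    """Total cores for a number of identical chips."""
    return units * cores_per_unit


pi_cluster = total_cores(750, PI_SOC_CORES)        # the Pi cluster
ten_phis = total_cores(10, XEON_PHI_7290_CORES)    # the proposed alternative
five_phis = total_cores(5, XEON_PHI_7290_CORES)    # the scaled-back alternative

# Fraction of cores lost by going with ten Xeon Phis instead of 750 Pis.
shortfall = 1 - ten_phis / pi_cluster

print(pi_cluster, ten_phis, five_phis, f"{shortfall:.0%} fewer cores")
```

    The shortfall works out to 76%, which matches the "75% fewer cores" in the subject line after rounding.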

  20. Re: Cost by Anonymous Coward · · Score: 0

    You can run MPI on a shared memory system. It will have no problem uncovering any concurrency race conditions.

    "There are no bugs in my code."

  21. Learn how to use a fucking apostrophe by Anonymous Coward · · Score: 0

    Learn how to use a fucking apostrophe

    1. Re:Learn how to use a fucking apostrophe by DontBeAMoran · · Score: 1

      Thereâ(TM)s A Cluster of 750 Raspberry Piâ(TM)s at Los Alamos National Lab

      There. Happy now?

      --
      #DeleteFacebook
    2. Re:Learn how to use a fucking apostrophe by Anonymous Coward · · Score: 1

      No, there isn't a square-boxed question mark.

  22. horse-shit by Anonymous Coward · · Score: 0

    Once researchers get something running, they stop optimizing even if it is only a few times faster, no matter how much better it could be. This cluster just encourages that behavior.

    pure unsubstantiated bullshit pulled straight from your ass

    1. Re: horse-shit by Anonymous Coward · · Score: 0

      No, real world. I ran a supercomputing lab for 20 years at a research lab. Researchers don't care how fast it could go. Is it faster than before? Great, keep running. Next year the CPUs will be twice as fast or twice as many. They are under enormous pressure for results, not tuning.

    2. Re: horse-shit by DontBeAMoran · · Score: 2

      Next year the CPUs will be twice as fast or twice as many.

      That was 20 years ago. Today we don't get those kinds of performance leaps anymore.

      --
      #DeleteFacebook
    3. Re: horse-shit by Anonymous Coward · · Score: 0

      All bow down to the Trump hater.

    4. Re: horse-shit by Anonymous Coward · · Score: 0

      Yes, we do. They keep packing more and more cores on a single chip. So the local compute power keeps going up. It's relatively easy to write parallel code on a single core with multiple threads. Spreading your job across a network is an order of magnitude harder since you need to understand where/how your computation can be split up. I have seen a lot of dusty deck codes go parallel on a single node with a few directives. It usually isn't that easy to make the next step.

    5. Re: horse-shit by Anonymous Coward · · Score: 0

      Was the lab in your mom's basement?

    6. Re: horse-shit by Anonymous Coward · · Score: 0

      It depends on what you are doing. As far as single threaded tasks, we haven't gotten a whole lot more computing power per core in the last decade; these days it's a race of cramming the most of these lower speed cores into the same chip. If we were ramping up speed like we were in the 90's to early 2000's I would suspect we would have 10+GHz CPUs by now. I suspect the next generation Pi will probably have individual cores at approximately the same speed we have now, but probably 6 or 8 cores on the SOC.

      Only if you are doing highly parallelized multi threaded tasks do all those extra cores help you. If you are the typical Pi "hacker" writing python scripts to make a LED on the GPIO blink, more cores aint going to give you any benefit. your python script is only running on one of those cores.

    7. Re: horse-shit by Anonymous Coward · · Score: 0

      Researchers don't care how fast it could go.

      Bullshit. When I get assigned a certain, finite number of cpuhours for a grant that is for several research goals, including multiple grad students that need to get thesis work out of those hours, we make damn sure that we squeeze what we can out of those hours. This involves a lot of runs on a tiny cluster we have within our department, both for optimization and scaling estimates, before we send off codes to the larger machine.

      And this isn't a recent phenomenon as far as I can tell, as I cut my teeth 20 years ago as a student by doing hardware tests and optimization for projects. If the lab you ran didn't have people stressed out over optimizing stuff, and it wasn't due to lack of demand, whose fault is that for not applying enough pressure and checks against wasting time?

  23. Is it really any useful? by Anonymous Coward · · Score: 0

    I would think that this could be solved more efficiently, albeit less fun, by a virtual cluster. The hardware is different enough from the real supercomputer anyway that performance benchmarking is probably out of the question.

  24. attack of the idiot by Anonymous Coward · · Score: 0

    The Chicago Manual of Style will do that for you.

    Maybe you need a supercomputer to figure out that books don't do any "figuring out"

  25. Re:Fuck Net Neutrality by DontBeAMoran · · Score: 1

    This happens so often that I think we need a new mod:
    Score: -1 Wrong topic, you idiot

    --
    #DeleteFacebook
  26. Re: Cost by gerf · · Score: 1

    Here you can unplug a node to simulate a hardware failure. The latency is more real world between nodes. Cache levels are more similar (L1 L2 RAM) , hardware levels (nic, bridge, CPU). It's a cheap approximation. Leave it at that.

  27. Re: Cost by Anonymous Coward · · Score: 0

    "Let's test this freeway system at small scale"
    "Nah let's take the airplane, no need to test it in the real world"

    Your continuing statements are not relevant.

  28. Re: Cost by Anonymous Coward · · Score: 0

    Your analogies have zero relevance to how HPCs work.

  29. Re: Cost by kyrsjo · · Score: 2

    Not useless if you're debugging queue systems, schedulers etc.

  30. Re:Cost by kyrsjo · · Score: 2

    ROFL. We're not talking about debugging the scientific number crunching code that will run on the actual cluster, but the cluster management software. The actual jobs to run may very well just be doing sleep(10000*rand()); if rand() < 0.1 call WriteAllTheDiskSpace; else if rand() < 0.2 call segfault_horribly(); else return SUCCESS; etc. One should probably add in a few more "bad things", MPI calls etc.
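
    The kind of dummy job described above could be sketched roughly as follows. This is an assumption-laden illustration, not anything LANL is known to run: the function names are invented, the "less than" comparisons against 0.1 and 0.2 are a guess at the intended thresholds, and the failure modes are returned as labels rather than actually filling a disk or crashing.

```python
import random
import time


def fake_job(rng, max_sleep_s=0.0, sleep=time.sleep):
    """Pretend to be a batch job: wait a random time, then pick an outcome.

    Outcomes are returned as labels instead of being carried out, so the
    sketch is safe to run anywhere.
    """
    sleep(max_sleep_s * rng.random())
    if rng.random() < 0.1:
        return "DISK_FULL"   # stand-in for WriteAllTheDiskSpace
    if rng.random() < 0.2:
        return "SEGFAULT"    # stand-in for segfault_horribly()
    return "SUCCESS"


def run_batch(n_jobs, seed=0):
    """Run n_jobs fake jobs and tally how each one ended."""
    rng = random.Random(seed)
    counts = {"DISK_FULL": 0, "SEGFAULT": 0, "SUCCESS": 0}
    for _ in range(n_jobs):
        counts[fake_job(rng)] += 1
    return counts


if __name__ == "__main__":
    print(run_batch(1000))
```

    A scheduler or queue system under test would submit many of these across the nodes and be judged on how it copes with the mix of outcomes; a real test job would also raise max_sleep_s above zero.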

  31. Re: Cost by Anonymous Coward · · Score: 0

    You could do all that by setting up a bunch of VMs on a shared memory machine. This cluster serves zero purpose.

  32. Re: Cost by mean+pun · · Score: 1

    Your analogies have zero relevance to how HPCs work.

    They are relevant. What is zero is your willingness to learn, or at least accept that other people know what they are doing.

  33. Re: Cost by Anonymous Coward · · Score: 0

    No they don’t. Morons like you have no idea how super computers actually work.

    The only difference between this cluster and a shared memory machine is that the shared memory machine will use less energy and will be useful for actual scientific work not just your idiotic debugging scenarios.

  34. Re:You mean 75% fewer cores, fewer connects, no RA by Anonymous Coward · · Score: 1

    1440 is NOT 3000!!! AND
    your power budget went to
    hell in a hand basket.

    One other thing, your being
    a DICKWAD.

  35. Re:You mean 75% fewer cores, fewer connects, no RA by Anonymous Coward · · Score: 0

    No, a single shared memory device will use less energy than a cluster plus the networking equipment needed to run it. Plus you can run more than one process on a physical or virtual core. So you will have no problem simulating a job requiring 10-100 times the number of physical cores.

  36. Re: Cost by Anonymous Coward · · Score: 0

    Well, technically they could just run their nodes as a bunch of virtual machines and save money as well as gain performance. That would take out the networking part of the equation though, so it wouldn't be quite the same. On the other hand, if you're running tests where networking and latencies might matter, I think using RPIs is a bit dubious considering you get 100Mbps tops over a pretty badly congested USB2 bus.

    Presumeably they know what they are doing, but I think they could have found something more suitable than the Pi and its yucky networking.

  37. Re: Cost by Anonymous Coward · · Score: 1

    The only difference between a cluster and a shared memory machine is you pay by the minute on the cluster. If you haven't realized that the consumption fee is the same whether you are debugging or doing actual science, then you are the moron who can't see the point of using a less powerful, less costly cluster to do mockups.

  38. Re: Cost by Anonymous Coward · · Score: 0

    Well then get an F250. It has the capacity to haul thousands of pies worth in traffic. With dealer incentives you can get them for under fifty grand.

    Plus you can haul a decent boat. Granted it doesn't solve the goal of the scientists, but the other "xeon" poster doesn't understand that, so suggest a good truck instead

    Or the Ford.

  39. Re: Cost by Anonymous Coward · · Score: 0

    Wow you are a stupid moron. You can set up a queue system that charges by the minute on a shared memory machine. By contrast, I have run scientific codes on clusters that do not charge by the minute.

    Try again dumbass.

  40. Re: Cost by Anonymous Coward · · Score: 0

    I think that other AC is at a level of idiocy simulating creamy dumpty.

  41. Re:Cost by Anonymous Coward · · Score: 0

    All of which you can do on a shared memory machine with the added bonus that the shared machine is useful for actual scientific work.

  42. Re: Cost by Anonymous Coward · · Score: 0

    Keep making moronic analogies dumbass.

  43. Re: Cost by rthille · · Score: 1

    The RPi modules are 4-core, so the cluster is 3000 cores.

    --
    Awesome furniture, accessories and cabinetry in Santa Rosa, CA: http://humanity-home.com/
  44. Re: Cost by guruevi · · Score: 1

    Interconnects don't matter much. Whether you use InfiniBand, GigE or serial, you're just pumping TCP packets.

    --
    Custom electronics and digital signage for your business: www.evcircuits.com
  45. Re: Cost by Anonymous Coward · · Score: 0

    Erm, wrong.

  46. Pi's by Anonymous Coward · · Score: 0

    Nice apostrophe, bro!

  47. Re: Cost by MrMr · · Score: 2

    I don't think it is acceptable to make an understandable and relevant car analogy.

  48. Re:HEY MODERATOR! GO SODOMIZE YOURSELF WITH SAWBLA by Anonymous Coward · · Score: 0

    Go swallow your own cock.

  49. Re: Cost by OrangeTide · · Score: 1

    I wonder how many Commodore 64's I could emulate at once on a decent sized workstation. Maybe 500?

    --
    “Common sense is not so common.” — Voltaire
  50. Re: Cost by ShanghaiBill · · Score: 1

    You could do all that by setting up a bunch of VMs on a shared memory machine.

    That will give you different latencies and different bottlenecks. The point of this system is not to crunch data, but to serve as a testbed for parallel software development. It is possible that they also use VMs, but that would be in addition to this cluster rather than a replacement.

  51. Re:Cost by Anonymous Coward · · Score: 0

    Good luck buying another 3000 core computer that only uses 4000 W ..with that kind of cash..yes the point here is many cores and low power for simulation of the really bad ass systems.. not raw single core performance of the dev system

  52. Re:You mean 75% fewer cores, fewer connects, no RA by viperidaenz · · Score: 0

    Or you could use a single Xeon Phi, emulate the 750 Raspberry Pi's and the networking and still consume less power with more performance for a lower price.

  53. I hope it's not idling. by drinkypoo · · Score: 1

    1kW at idle is a lot. You could cut that down by shutting down Pis in banks as they went unused, and firing them up again as needed. It wouldn't require very much more hardware, just some microrelay boards which can be driven by some of the Pis themselves.

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    1. Re:I hope it's not idling. by Anonymous Coward · · Score: 0

      1 kW is not a lot of power, it's the equivalent of someone always using the hand dryer in the bathroom. Especially compared to the power that los alamos uses as a whole!

    2. Re:I hope it's not idling. by drinkypoo · · Score: 1

      1 kW is not a lot of power, it's the equivalent of someone always using the hand dryer in the bathroom.

      It's not a lot of power on their scale, it's true. But it's a lot of power to waste idling on a low-power project, when it's easily avoidable.

      It's probably not idling much anyway.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    3. Re:I hope it's not idling. by Anonymous Coward · · Score: 0

      You're talking about a facility that uses multiple megawatts. Even the 4kwatts this thing uses at full power is probably a rounding error on their power bill.

    4. Re:I hope it's not idling. by dj245 · · Score: 1

      1kW at idle is a lot. You could cut that down by shutting down Pis in banks as they went unused, and firing them up again as needed. It wouldn't require very much more hardware, just some microrelay boards which can be driven by some of the Pis themselves.

      Electricity in New Mexico is $0.11 to $0.12 per kwh. So at maximum they would save $0.12 per hour. You would spend far more in labor/coding/hardware than you would ever save in power costs. Plus you may introduce bugs or other issues that would take even more time to fix or delay useful work.

      Do you work in academia?
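
      The savings estimate above is simple enough to sketch. The function name is made up, and the $0.12/kWh price and 1 kW idle draw are just the figures quoted in this thread.

```python
def idle_cost_usd(power_kw, price_per_kwh, hours):
    """Energy cost of a constant load: kW x $/kWh x hours."""
    return power_kw * price_per_kwh * hours


HOURS_PER_YEAR = 24 * 365

per_hour = idle_cost_usd(1.0, 0.12, 1)
per_year = idle_cost_usd(1.0, 0.12, HOURS_PER_YEAR)

print(f"${per_hour:.2f} per hour, ${per_year:.2f} per year if idle around the clock")
```

      Even if the cluster idled all year, that is roughly a thousand dollars, which supports the point that engineering time would cost more than the power saved.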

      --
      Even those who arrange and design shrubberies are under considerable economic stress at this period in history.
  54. Big deal by fahrbot-bot · · Score: 1

    There's A Cluster of 750 Raspberry Pi's at Los Alamos National Lab

    I saw a bunch of them at the grocery store before Thanksgiving, next to the apple ones.

    --
    It must have been something you assimilated. . . .
  55. Re: Cost by Anonymous Coward · · Score: 0

    That’s not how InfiniBand works.

  56. Re: Cost by Anonymous Coward · · Score: 0

    The point of this system is not to crunch data, but to serve as a testbed for parallel software development.

    A shared memory machine makes a far better test bed for that purpose.

  57. Re: You mean 75% fewer cores, fewer connects, no R by F.Ultra · · Score: 1

    Emulating the cores would falsify what they are testing since this would reduce a lot of possible race conditions (among other things). Virtualization is nice but it's not an end all solution.

  58. Speed by tigersha · · Score: 3, Interesting

    Purely out of academic interest, how fast is this thing? How does it compete with, say, a 16 core Xeon or Threadripper workstation?

    --
    The dangers of excessive individualism are nothing compared to the oppressiveness of excessive collectivism
    1. Re:Speed by qubezz · · Score: 1

      A Raspberry Pi 3 can do 6 GFLOPs, if you can keep it cool enough to not immediately start throttling. 6 GFLOPs x 750 = 4.5 TFLOPs. A single NVidia GTX 1080Ti does 4TFLOPs double-precision and 11.5TFLOPs single-precision.

      The academic interest is that this actually has 750 separate and independent CPUs and nodes, so one can see how tasks scale and bottleneck. You can't accurately virtualize all these parameters.
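
      For the aggregate figure, a minimal sketch of the unit conversion, taking the 6 GFLOPs-per-board number from this comment at face value (it is the commenter's estimate, not a measured benchmark, and the names below are illustrative):

```python
GFLOPS_PER_PI3 = 6.0  # commenter's figure; assumes no thermal throttling
NODES = 750


def aggregate_tflops(nodes, gflops_per_node):
    """Sum per-node GFLOPs and convert to TFLOPs (1 TFLOP = 1000 GFLOPs)."""
    return nodes * gflops_per_node / 1000.0


cluster_tflops = aggregate_tflops(NODES, GFLOPS_PER_PI3)
print(f"{cluster_tflops} TFLOPs theoretical aggregate")
```

      This is a best-case sum; it ignores the interconnect entirely, which is the part the cluster is actually meant to exercise.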

  59. Re: You mean 75% fewer cores, fewer connects, no R by Anonymous Coward · · Score: 0

    Emulating the cores would falsify what they are testing since this would reduce a lot of possible race conditions.

    No it wouldn’t. It would make race conditions more likely to trigger.

  60. Re: Cost by Anonymous Coward · · Score: 0

    You do get four threads per core, so 11 chips will suffice to get to the 3000 threads. Next you need to build a delay system to simulate the various interconnect types and disable cache coherence between those cores, or threads as needed. That's what I would imagine it taking, at least. And Occam, lots of Occam.
    A rack system with three servers of four fully loaded Phi nodes would cost at least a third more to buy, and the power consumption is probably a little higher (max 5-6kW), so this Pi system would be cheaper to buy and to operate (by one point of comparison).

  61. Re: Cost by Anonymous Coward · · Score: 0

    The latencies and bottlenecks of this system will have zero relevance to the production computer. Again making this cluster a complete waste. Better to do it on a shared memory machine.

  62. Re: You mean 75% fewer cores, fewer connects, no R by viperidaenz · · Score: 1

    And running 1.2GHz 4 core STB processors over 10/100 ethernet is going to be similar to clusters of dual socket 3GHz 54 core processors with 25+ Gbps interconnects? (aka Cavium ThunderX2 CPUs. Nobody is planning on building an ARM based supercomputer with only one CPU per node, let alone with IO limited smartphone/tablet/set-top-box oriented SoCs)

  63. Re: Cost by Anonymous Coward · · Score: 0

    He's absolutely right.

    https://youtu.be/7ffj8SHrbk0

  64. Re: Cost by Anonymous Coward · · Score: 0

    Presumeably

    Dear God!

  65. Re:Obvious unimportant topic by 14erCleaner · · Score: 1, Funny

    This will obviously be used to verify global warming, so it belongs here. Let's argue politics!

    --
    Have you read my blog lately?
  66. Re: Cost by Anonymous Coward · · Score: 0

    Okay... This is what it's supposed to emulate.

    This thing has more than nine hundred thousand processor chips and two petabytes of memory. Current x64 chips are limited to 256 TB (wikipedia) of physical address space; so either [a] these chips have a larger than usual physical address space (I doubt it), or [b] this isn't a shared memory system.

    So, dumbnuts, this isn't a shared memory system. Go read about the Cray XC40. Or even this document -- clearly showing it's a multi-node system with a fast interconnect. (It talks of each node running different OS images, so that means it isn't one shared OS image - which means it isn't shared memory).

    Summary: What evidence do you have that the target system is shared memory? It looks to me like it's non-shared-memory (i.e., message passing); even with an extremely fast interconnect, I'm sure it's still slower than the CPU's internal buses. The same is true of this Raspberry Pi cluster - the interconnect (ordinary Ethernet) is still significantly slower than the ARM chip itself; and THAT environment is what's being emulated.

    It doesn't really matter that other architectures could be faster - the GOAL is to replicate how the Cray XC supercomputers work - albeit at a fraction of the performance and price.
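
    As a rough check of the address-space argument above, here is a minimal sketch of the arithmetic. It uses only the two figures quoted in this comment (two petabytes of memory, a 256 TB physical address limit) and assumes binary units; the helper name address_bits is made up for illustration.

```python
# Figures quoted in the comment above, treated as binary units (an assumption).
TIB = 2**40
PIB = 2**50

total_memory = 2 * PIB        # "two petabytes of memory"
x64_phys_limit = 256 * TIB    # "limited to 256 TB of physical address space"


def address_bits(n_bytes: int) -> int:
    """Smallest number of address bits able to index n_bytes distinct bytes."""
    return (n_bytes - 1).bit_length()


limit_bits = address_bits(x64_phys_limit)   # bits covered by the quoted limit
needed_bits = address_bits(total_memory)    # bits a single flat space would need
ratio = total_memory // x64_phys_limit      # how many times over the limit

print(f"quoted limit covers {limit_bits} address bits")
print(f"2 PiB in one address space needs {needed_bits} bits")
print(f"total memory is {ratio}x the per-chip limit")
```

    Under those assumptions the machine's total memory is several times what one chip could address, which is consistent with option [b]: memory split across nodes and reached by message passing.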

  67. Timing interconnects... by Anonymous Coward · · Score: 0

    There are some timing interconnects on the BitScopes which Bruce uses to sync the signals and reduce the processing requirements.
    We've heard him speak about it here.
    Have to get him talking further on that side of it.

  68. Re: Obvious unimportant topic by Buck+Feta · · Score: 2

    You raang?

    --
    I am Audience.
  69. Re: Cost by Anonymous Coward · · Score: 0

    Morons like you have no idea how super computers actually work.

    Funny, I've written actual code for shared use supercomputers like the ones discussed in TFA, and yet you and your ilk are the ones that look dense and naive.

    Many of these machines have an application process and you must demonstrate that you will make efficient use of your time on the machine. If your utilization is too high, you run out of time before you get the results and may have to wait a while for your turn. Underutilize or get stuck in some crash, and you can get penalized, usually being told to go wait for time on a smaller machine and to fix your shit before being allowed to apply for time again.

    For research projects dealing with a limited budget of cpu hours from a grant process, "debugging" and optimization is not idiotic, and becomes quite important. It amounts to bureaucracy, but that is necessary at some scales. Calling it idiotic is on par with saying it is idiotic to have budget planning and approval paperwork for spending on a large project. At large enough scales, just winging it and the associated mistakes cost a lot of people time, which adds up to a lot more than an ounce of prevention.
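
    To put numbers on the budgeting point, here is a minimal sketch of how a failed run eats into a CPU-hour grant. Every figure in it (allocation size, node counts, cores per node, wall times) is hypothetical and chosen only for illustration; real allocation policies differ between sites.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """One batch job on a shared machine (all values hypothetical)."""
    name: str
    nodes: int
    cores_per_node: int
    wall_hours: float

    def core_hours(self) -> float:
        # Jobs are charged for every core they hold, busy or idle.
        return self.nodes * self.cores_per_node * self.wall_hours


def remaining_budget(allocation: float, jobs: list[Job]) -> float:
    """Core-hours left after charging every job in the list."""
    return allocation - sum(job.core_hours() for job in jobs)


allocation = 100_000.0  # hypothetical grant, in core-hours

# A run that crashes late still costs its full wall time.
crashed = Job("crashed-run", nodes=512, cores_per_node=32, wall_hours=2.5)
production = Job("production-run", nodes=512, cores_per_node=32, wall_hours=4.0)

left = remaining_budget(allocation, [crashed])
fits = production.core_hours() <= left

print(f"crashed run cost: {crashed.core_hours():.0f} core-hours")
print(f"left in budget:   {left:.0f} core-hours")
print(f"production run needs {production.core_hours():.0f}; fits: {fits}")
```

    In this made-up case a single crash leaves too little budget for the production run, which is the sort of mistake a cheap test cluster is meant to catch first.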

  70. The Raspberry Bi... by Anonymous Coward · · Score: 0

    The first bi-sexual supercomputer cluster?

  71. Re: Cost by Anonymous Coward · · Score: 0

    "He explained how the whole cluster can be bootstrapped from a single Micro SD card plugged into one of the nodes and how its power consumption and cooling requirements are vastly lower than similar scale HPC." [http://cluster.bitscope.com/blog/bitscope-raspberry-pi-cluster-press-conference]

    "Cluster simulations can help to some extent but in many cases real-world issues can intervene to mitigate their effectiveness"
    [http://cluster.bitscope.com/motivation]

  72. Scientist vs Science Advocate by Anonymous Coward · · Score: 0
  73. Re:Cost by coofercat · · Score: 1

    Oh yes, true. Now why on earth didn't the folks at Los Alamos think of that!? They must be complete idiots. You should write to them and explain your ideas - they might give you a job as their chief architect, or maybe their Head of Cost Cutting.

  74. The Megaprocessor Still Laughs by tmjva · · Score: 1

    I think this was also on slashdot last year:

    https://robertmcgrath.wordpress.com/tag/the-megaprocessor-laughs-at-your-puny-integrated-circuits-stephen-cass/

    --
    Tracy Johnson
    Old fashioned text games hosted below:
    http://empire.openmpe.com/
    BT
  75. Re:HEY MODERATOR! GO SODOMIZE YOURSELF WITH SAWBLA by KingBenny · · Score: 1

    moderation is for assholes, but ... here you don't get moderated, you get rated ... one might say you seem to be overreacting a bit but that's fine, i divide my days between standard and less bad too. if this is your biggest problem today then it can't be that bad, it's not like you get money or anything for it, right? i like the "green computing" approach here btw

    --
    Free speech was meant to be free for all... how can anyone grow up in a nanny state ?
  76. Re: Cost by Anonymous Coward · · Score: 0

    Yes, I'm sure all the folks at Los Alamos are far stupider than you, Anonymous Coward.

  77. Re: You mean 75% fewer cores, fewer connects, no R by F.Ultra · · Score: 1

    They are testing how their software scales to a massive number of cores. This you cannot do on a single Xeon Phi. The speed and available bandwidth are irrelevant for that; they are of course relevant for other test cases, but that is not what they test here.

  78. Re: You mean 75% fewer cores, fewer connects, no R by F.Ultra · · Score: 1

    No, because if a single core emulates 10 other cores, there will never be a situation where those 10 cores all execute an instruction at the same time. The laws of physics, you know.
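
    A minimal sketch of that point, assuming the simplest possible emulator: one host core stepping ten virtual cores in round-robin order. The function name and the trace format are invented for illustration; real emulators are far more elaborate and may spread work over several host threads, which this deliberately does not model.

```python
from collections import Counter


def emulate_round_robin(n_virtual_cores: int, instructions_per_core: int):
    """Interleave virtual cores on one host core.

    Returns a trace of (host_step, virtual_core) pairs, one entry per
    emulated instruction, in the order the host executed them.
    """
    program_counters = [0] * n_virtual_cores
    trace = []
    host_step = 0
    while any(pc < instructions_per_core for pc in program_counters):
        for core in range(n_virtual_cores):
            if program_counters[core] < instructions_per_core:
                # The single host core can serve only one virtual core per step.
                trace.append((host_step, core))
                program_counters[core] += 1
                host_step += 1
    return trace


trace = emulate_round_robin(n_virtual_cores=10, instructions_per_core=3)

# Count how many virtual cores were active in each host step.
active_per_step = Counter(step for step, _core in trace)
max_simultaneous = max(active_per_step.values())
total_host_steps = len(active_per_step)

print(f"host steps used: {total_host_steps}")
print(f"most virtual cores active in one step: {max_simultaneous}")
```

    In this model the ten virtual cores never overlap in time, so races and contention that depend on truly simultaneous execution cannot show up.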

  79. Re: You mean 75% fewer cores, fewer connects, no R by viperidaenz · · Score: 1

    Someone should invent software for emulating a CPU, that way you could use one machine to emulate many.

    I'd call it a virtual machine.

  80. Re: You mean 75% fewer cores, fewer connects, no R by F.Ultra · · Score: 1

    And you cannot (as of yet) effectively simulate the kind of massive scale out that places like this code for.