Slashdot Mirror


ARM In Supercomputers — 'Get Ready For the Change'

An anonymous reader writes "Commodity ARM CPUs are poised to replace x86 CPUs in modern supercomputers just as commodity x86 CPUs replaced vector CPUs in early supercomputers. An analysis by the EU Mountblanc Project (PDF) (using Nvidia Tegra 2/3, Samsung Exynos 5 & Intel Core i7 CPUs) highlights the suitability and energy efficiency of ARM-based solutions. They finish off by saying, 'Current limitations [are] due to target market condition - not real technological challenges. ... A whole set of ARM server chips is coming - solving most of the limitations identified.'"

238 comments

  1. IMHO - No thanks. by Anonymous Coward · · Score: 2, Insightful

    PC user, hardcore gamer and programmer here; for me, energy efficiency is a lesser priority than speed in a CPU. Make an ARM CPU compete with an Intel Core i7 2600K, and show me it's overclockable with few issues, and you got my attention.

    1. Re:IMHO - No thanks. by Stoutlimb · · Score: 5, Insightful

      No doubt your CPU would win. But when looking at power/price as well, you'd have to pit your CPU against 50 or so ARM chips in parallel. For some solutions, it may be a far better choice. One size doesn't fit all.

    2. Re:IMHO - No thanks. by Anonymous Coward · · Score: 2, Interesting

      architecture is complicated. but in terms of ops per mm^2, or ops per watt, ops per $,
      cycles per useful op, the x86 architecture is a heinous pox on the face of the
      earth.

      worse yet, your beloved x86 doesn't even have any source implications, it's just
      a useless thing.

    3. Re:IMHO - No thanks. by Anonymous Coward · · Score: 1

      The article is aimed at supercomputers, not commodity PC. You are not the target.

    4. Re:IMHO - No thanks. by Anonymous Coward · · Score: 1

      Then enjoy your Wintel dinosaur.

      Surprising though it may seem to you, the rest of the world will route around you without even noticing.

    5. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0, Flamebait

      Wow, I'm glad you spoke up!

      A lot of supercomputers could have been built with the wrong CPUs if you hadn't been here to set everybody straight. The computing world really owes you big time!

      What a close call, hey everybody?

    6. Re:IMHO - No thanks. by c0lo · · Score: 4, Funny

      The article is aimed at supercomputers, not commodity PC. You are not the target.

      While not the target, you'll be collateral damage anyway.

      --
      Questions raise, answers kill. Raise questions to stay alive.
    7. Re:IMHO - No thanks. by king+neckbeard · · Score: 5, Informative

      You aren't operating in the supercomputing market. There, what matters is how much processing you can get for how much money. You can always buy more chips, and power usage and cooling are both significant factors. That's why x86 became dominant in that space. It was cheaper to buy a bunch of x86 chips than to buy fewer POWER chips. In terms of computing power, a POWER7 will eat your i7 for breakfast, but they are ungodly expensive.

      --
      This is my signature. There are many like it, but this one is mine.
    8. Re:IMHO - No thanks. by Colonel+Korn · · Score: 5, Informative

      architecture is complicated. but in terms of ops per mm^2, or ops per watt, ops per $,
      cycles per useful op, the x86 architecture is a heinous pox on the face of the
      earth.

      worse yet, your beloved x86 doesn't even have any source implications, it's just
      a useless thing.

      In TFA's slides 10 and 11, Intel i7 chips are shown to be more efficient in terms of performance per watt than ARM chips. However, they're close to each other and Intel's prices are significantly higher.

      --
      "I zero-index my hamsters" - Willtor (147206)
    9. Re:IMHO - No thanks. by arbiter1 · · Score: 0

      50 ARM CPUs, eh? The problem is finding something that can scale to that many CPUs.

    10. Re:IMHO - No thanks. by dbIII · · Score: 3, Interesting

      Then you use something else as well. High performance computing server rooms already have a mix of stuff, especially since the AMD chips can give you a 64 core machine with half a terabyte of memory for $14K but it's not as fast per core as the two way Xeons. The parallel stuff is done on the plentiful and slower cores while the single-threaded stuff is done on the faster cores - then GPUs do whatever parallel stuff you can feed them (memory and bandwidth limiting issues keep them from doing some tasks).

    11. Re:IMHO - No thanks. by dbIII · · Score: 1

      It was a two week process to attempt to buy a single low end machine with one of those things to see if it was viable for a particular task - two weeks getting my company's wallet weighed by a slimy bastard that made used car salesmen look like saints and a lot of veiled comments that may have been about kickbacks. In the end the price was more than that of four gold plated IBM Xeon systems of similar clockspeed or about double that in whitebox systems. Sounds like you need a black budget immune from the eyes of accountants to buy one of the things.

    12. Re:IMHO - No thanks. by KiloByte · · Score: 4, Interesting

      Damage or a winner? I feel so bad about having a cheap, efficient, and above all, quiet box.

      I bought this 4*2GHz baby, and the only reason it's not my main desktop yet is a weird and asinine requirement for monitor resolution to be exactly 720 or 1080 (WTF?!?). I think I'll replace my old but perfectly working pair of 1280x1024 monitors (I hate 16x9!), and put the big loud clunker to the cellar. I just hate the noise so much. x86 machines with no moving parts are extremely hard to get, and have terrible performance/price. Anything that requires lots of processing power: compilation, running Windows VMs, etc, can be done remotely from the cellar just as well, while a 2GHz arm is fast enough to do client stuff, running a browser being the most demanding part.

      And what else do you need to reside directly on the machine you plop your butt at?

      --
      The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
    13. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      You sound like a hardcore gamer but not a hardcore programmer. So long as you can use parallelization for a task, performance per watt for a chip is more important than raw horsepower per CPU.

    14. Re:IMHO - No thanks. by XaXXon · · Score: 1

      Why did you even say this? "PC users" aren't even mentioned in this article. This article is about supercomputers where the workloads are virtually by definition extremely parallel and the restrictions are around price and power consumption, not "FPS on a single game".

    15. Re:IMHO - No thanks. by crutchy · · Score: 1

      Most PC users depend on parallel computing in ways they can't even imagine

      what do you think goes on at the other end of the copper/fibre cable?

    16. Re:IMHO - No thanks. by symbolset · · Score: 4, Interesting

      The problem you have is the software tools you use sap the power of the hardware. Windows is engineered to consume cycles to drive their need for recurrent license fees. Try a different OS that doesn't have this handicap and you'll find the full power of the equipment is available.

      --
      Help stamp out iliturcy.
    17. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      cores and cpu's are not the same thing just so you know

    18. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      To a programmer the distinction is irrelevant.

    19. Re:IMHO - No thanks. by hi-endian · · Score: 1

      Not really sure how your personal needs are at all relevant in this situation, as this post is about servers and supercomputers (ie computers that typically deal with highly parallelized tasks), not about home gaming rigs.

    20. Re:IMHO - No thanks. by 0123456 · · Score: 2

      I feel so bad about having a cheap, efficient, and above all, quiet box.

      So do I. I can't even hear my i7 machine when playing games on it, whereas the old Pentium-4 sounded like a vacuum cleaner.

    21. Re:IMHO - No thanks. by Dcnjoe60 · · Score: 2

      50 ARM CPUs, eh? The problem is finding something that can scale to that many CPUs.

      Well the article is about arms being used in supercomputers, so scalability is probably not going to be a problem.

    22. Re:IMHO - No thanks. by LordLimecat · · Score: 2

      The Core i7 might very well still win. Remember that Intel is more efficient in computing work per watt, and an Ivy Bridge Core i7 3770K uses 77W. If your average ARM chip uses 2 watts, that means that ~30 ARM chips will still get beaten by the Core i7....
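
A rough check of the arithmetic in this comment, as a minimal sketch: it uses only the 77 W and 2 W figures quoted above, the function name is hypothetical, and per-chip throughput is deliberately not modeled because the comment gives no number for it.

```python
def chips_within_power_budget(budget_watts: float, watts_per_chip: float) -> int:
    """Whole number of chips whose combined draw fits inside the budget."""
    if watts_per_chip <= 0:
        raise ValueError("watts_per_chip must be positive")
    return int(budget_watts // watts_per_chip)


I7_3770K_WATTS = 77.0  # figure quoted in the comment
ARM_CHIP_WATTS = 2.0   # the comment's assumed average, not a measured value

arm_chips = chips_within_power_budget(I7_3770K_WATTS, ARM_CHIP_WATTS)
print(arm_chips)  # 38
```

At these figures about 38 ARM chips fit inside one i7's power envelope, so the "~30" in the comment leaves some margin. Whether that many chips out-compute the i7 depends on per-chip throughput, which this sketch does not address.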

    23. Re:IMHO - No thanks. by c0lo · · Score: 1

      If it's the OP AC, whinging about how his games don't work well on ARM - then it's a damage (not that I regret it).
      If it's you (thanks for the link: nice to see others on top of RasPi) or me - then it's winning.

      Speaking about quiet: I recently bought a Proliant Microserver for the "home FS"/NAS - at 15W for the Turion and the 4 NAS grade WD HDDs... mums, I can't hear it (under 60W at peak use). I would have gone with an ARM board, but couldn't find enough support for NAS-ing (not when RAID-ing anyway).

      btw: I don't have a cellar... yet. When I'll have one, 't'll be for wine only... ummm... maybe a bit of mead as well.

      --
      Questions raise, answers kill. Raise questions to stay alive.
    24. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      For mobile solutions it fits. For supercomputers? Battery life isn't a term. The largest expense over the lifetime of ANY "super computer" worthy of the term is going to be energy consumption (including cooling). Until ARM can match Intel/IBM/etc. on that there's no great logical argument for their use therein, though other uses (if lower initial cost is more important for the application) would warrant another discussion.

      As for performance per watt, ARM is certainly catching up to Intel and x86 relatively. But you'll notice the researchers use the absolute newest ARM Cortex-A15 (28nm process) architecture for ARM while going back a full 2 years for Intel's Sandy Bridge to compare performance per watt. Ignoring the soon-to-be-released Intel Haswell architecture could be due to timing, but if they feel free to compare mobile parts why deliberately ignore Intel's mobile release of Ivy Bridge (22nm)? A platform for which performance per watt was shown to have improved a significant amount. Without this comparison the analysis is incomplete and thus fundamentally flawed.

      Not that arguments against ARM being able to match up to the requirements of data centers and super computers isn't without merit, while the Cortex A15 is probably not the design to get ARM into that space, their upcoming 64 bit server/datacenter/etc. oriented architecture may stand a much better chance. Either way the engineering battle between Intel and ARM is at the very least good for consumers no matter who is currently "winning". As for anyone arguing that x86 is old and will be replaced by ARM "Because ARM is more efficient!" I'm sure there are others much more qualified for a very logical rebuttal, but instead I'll just say actual data such as Intel's upcoming Silvermont architecture can speak for itself: http://www.anandtech.com/show/6936/intels-silvermont-architecture-revealed-getting-serious-about-mobile

    25. Re:IMHO - No thanks. by MichaelSmith · · Score: 1

      With sufficient abstraction.

    26. Re:IMHO - No thanks. by aztracker1 · · Score: 4, Insightful

      Exactly, then again, there are plenty of non-cpu intensive loads.. part of the popularity and growth of NodeJS is that a lot of jobs are IO bound, and even a lot of web services/sites are spending most of their time waiting on files, or network resources/services... 10 arm CPU's handling 10K simultaneous requests, is as good as 1 uber-cpu handling 10K simultaneous requests... for that matter, there's been a lot of work done in MessageQueue routing, and distributed databases... ARM is a pretty good fit for an environment designed to scale horizontally. Some of the first things I wanted to try on my Raspberry Pi were MongoDB and NodeJS, with the thought that a couple dozen of them might work better with more resilience than a few larger systems...
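
The I/O-bound point above can be illustrated with a minimal sketch. It uses Python's asyncio rather than NodeJS, the function names are hypothetical, and the sleep merely stands in for a file or network wait, so it shows the shape of the workload and not any real server.

```python
import asyncio


async def handle_request(request_id: int, io_delay: float) -> int:
    """Pretend to serve a request whose time is dominated by waiting on I/O."""
    await asyncio.sleep(io_delay)  # stands in for a file or network wait
    return request_id


async def serve(num_requests: int, io_delay: float = 0.01) -> list[int]:
    """Run all requests concurrently on one event loop, i.e. one core."""
    tasks = [handle_request(i, io_delay) for i in range(num_requests)]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    results = asyncio.run(serve(1000))
    print(len(results))
```

Because every request spends its time waiting, a single slow core can keep many of them in flight at once. That is the sense in which a modest CPU can be a reasonable fit for this kind of load.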

      For the record, I think addressing a bit more memory, and larger/faster storage channels are what's holding back some of these systems.. which aren't a problem at super-computer scale.. but for someone wanting to put together a small cluster, it gets irritating.

      --
      Michael J. Ryan - tracker1.info
    27. Re:IMHO - No thanks. by Redmancometh · · Score: 2

      Useless for what you do. The second performance...not performance per watt...PERFORMANCE becomes an issue..ARM is a steaming pile of shit and you know it. If you're doing anything more than what the above AC said (keep playing sudoku, and portal) it can't handle it. How about everyday consumers who need a tablet that can actually do work? A gimp version of windows is not going to get the job done. Some of the Samsung Slate tablets however come with an x86...and are actually fully functional! Can you point to an ARM tablet that can do everything it can? Or any other x86 tablet for that matter?

      I know it's not about the software. However, unfortunately, sometimes raw productivity is all that matters. Sometimes the latest windows RT garbage dump or iOS xyz isn't going to hold water. The fact of the matter is the software that will run on a system defines how productive that device is going to be. Me and you might be able to put a proper operating system on one of these...but your whole company? Hell no.

    28. Re:IMHO - No thanks. by Redmancometh · · Score: 1

      That first sentence was supposed to be posted on another article...but you can't edit or delete on slashdot which is pretty awful.

    29. Re:IMHO - No thanks. by aztracker1 · · Score: 3, Informative

      The last two times I ran Linux on my desktop I ran into issues that weren't impossible to overcome, just a pain in the ass to deal with... I had a desktop with two graphics cards in sli, and two monitors.. getting them both working in 2006 was a pain, I know that was seven years ago, but still... far harder than it should have been.. in 2007, my laptop was running fine, upgraded to the latest ubuntu, nothing but problems.. In the first case, XP/Vista were less trouble, in the second, Win7 RC1 ran better... I also ran PC-BSD for a month, which was probably the nicest experience I've had with something outside win/osx on my main desktop, but still had issues with virtual machines that was a no-go.

      Given, my experiences are pretty dated, and things have gotten better... for me, linux is on the server(s) or in a virtual machine... every time I've tried to make it my primary OS has been met with heartache and pain. I replaced my main desktop a couple months ago, and tried a few Linux variants.. The first time, I installed on my SSD, then when I plugged in my other hard drives, it still booted, but an update to Grub screwed things up and it wouldn't boot any longer. This was after 3 hours of time to get my displays working properly.... I wasn't willing to spend another day on the issue, so back to Windows I went. I really like Linux.. and I want to make it my primary desktop, but I don't have extra hours and days to tinker with problems an over-the-wire update causes... let alone the initial setup time which I really felt was unreasonable.

      I've considered putting it as my primary on my macbook, but similar to windows, the environment pretty much works out of the box, and brew takes things a long way towards how I want it to work. Linux is close to 20 years old.. and still seems to be more crusty for desktop users than windows was a decade and a half ago in a lot of ways. In the end, I think Android may be a better desktop interface than what's currently on offer from most of the desktop bases in the Linux community, which is just plain sad... I really hope something good comes out of it all, I don't like being tethered to Windows or OSX... I don't like the constraints... but they work, with far fewer issues... the biggest ones being security related... I think that Windows is getting secure faster than Linux is getting friendlier, or at least easier to get up and running with.

      --
      Michael J. Ryan - tracker1.info
    30. Re:IMHO - No thanks. by 0123456 · · Score: 2

      I had a desktop with two graphics cards in sli, and two monitors

      Given SLI barely works in Windows, expecting it to work in Linux was optimistic. I recently booted up a Linux Mint DVD on my laptop to try it out and... everything just works. Even using the 'recovery partition' to reinstall Windows on there takes over three hours, reboots about thirty times and breaks with barely decipherable and completely misleading error messages if you installed a hard drive larger than the one that came with it.

       

      Linux is close to 20 years old..

      And the BSD core in MacOS is close to 40 years old.

      Android would make a lousy desktop interface, just like Windows 8. It was designed for phones and is barely a usable tablet interface. Of course, it probably is more usable than Gnome 3.

    31. Re:IMHO - No thanks. by Anonymous Coward · · Score: 1

      Eh, not really, it depends on the workload of course. Sun/Fujitsu and HP have been doing 64-socket systems for a long time now, SGI used to do it too. If you're talking cores and threads, Sun/Oracle and IBM have been making systems in the 64-core and 512-thread range (on 4 sockets) for a decade or so, and Oracle pumped out 8-socket 1024-thread beasts earlier this year. The trick is finding everyday workloads that benefit from that kind of parallelization, as well as needing that level of upward scaling.

      I don't see these kinds of systems kicking off in the consumer market in the immediate future, if only because the majority of workloads don't benefit all that much from it. It's a no-brainer in HPC, and can stand to finally serve as competition to Sparc and Power in the highest tiers of the enterprise, if single-threaded performance (and other considerations, such as the multitude of special-purpose co-processors available on these systems) can be brought up to par.
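
Why "finding workloads that benefit" is the hard part can be sketched with Amdahl's law. The comment does not name it, so its use here is an editorial assumption. The function name and the sample fractions are illustrative only.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Ideal speedup when only `parallel_fraction` of the work can be spread
    over `workers` threads and the remainder stays serial (Amdahl's law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # 512 threads, as in the systems mentioned above; fractions are made up.
    for fraction in (0.50, 0.95, 0.99):
        print(fraction, round(amdahl_speedup(fraction, 512), 1))
```

Under this model a workload that is only half parallel barely reaches a 2x speedup on 512 threads, which is consistent with the remark that most everyday workloads do not benefit much.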

    32. Re:IMHO - No thanks. by Technician · · Score: 1

      Is it worth the wait for the next gen of low power chips to arrive?

      --
      The truth shall set you free!
    33. Re:IMHO - No thanks. by Anonymous Coward · · Score: 1

      No, it is not. When working with NUMA you will have to think about how you move threads between cores or cpus. Moving threads between cpus will require you to move cache as well, and the performance impact can be quite dramatic.
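
One common way to avoid the migration cost described above is to pin work to a CPU. The following is a minimal sketch, assuming Linux, where Python exposes `os.sched_getaffinity` and `os.sched_setaffinity`. The helper name is hypothetical, and it does not look up NUMA node topology or set a memory policy, both of which a real NUMA-aware program would also need.

```python
import os


def pin_to_one_cpu() -> set[int]:
    """Restrict the caller to a single CPU it is already allowed to run on.

    Returns the resulting affinity set, or an empty set on platforms where
    the scheduler affinity calls are not available.
    """
    if not hasattr(os, "sched_setaffinity"):
        return set()  # e.g. macOS or Windows: nothing is changed
    allowed = os.sched_getaffinity(0)   # 0 refers to the caller
    target = min(allowed)               # arbitrary choice for illustration
    os.sched_setaffinity(0, {target})
    return set(os.sched_getaffinity(0))


if __name__ == "__main__":
    print(pin_to_one_cpu())
```

Once pinned, the scheduler no longer moves the work between CPUs, so its cache contents stay where they are. The trade-off is that the OS loses the freedom to rebalance load.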

    34. Re:IMHO - No thanks. by Khyber · · Score: 2

      "For supercomputers? Battery life isn't a term."

      You say that until the power grid fails and your generator fails to kick on, leaving you with only battery backup in place.

      --
      Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.
    35. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      . I just hate the noise so much. x86 machines with no moving parts are extremely hard to get, and have terrible performance/price.

      That's not entirely true, you just gotta look in the "business-class" offerings for quiet running desktops. Generally the expandability makes up for many if not all shortcomings. I'm quite fond of Lenovo's offerings, unless I'm going heavy on the graphics processing, I can barely hear a thing (I say barely, because I'm not using SSDs). Maybe I got lucky and found a system with an AM2+ board that wasn't locked to AM2, even after putting a hexacore Phenom in there, it still runs quiet - until I start making the GPU cry, that is.

      I'm a graphic artist and musician though, so I kinda do need to have horsepower in the machine I'm sitting at, the workflow doesn't lend itself well to working remotely.

    36. Re:IMHO - No thanks. by Anonymous Coward · · Score: 3, Insightful

      A single ARM 4 core A-15 running 1.5 GHz per core blows away any competing chip at the same specs, on power AND price. It's not limited to the calculations x86 are and can process graphics and physics better as a result.

      Translation: It gets raped sideways on single-threaded performance and you have to double up on sockets right out of the gate.
      It's a bit of a misconception about ARM and x86. ARM wins on watts/socket and MHz/watt, but Intel's i7s cream ARM on performance/watt. Once you account for those two factors, ARM isn't as competitive as you might think. Now, I'm not saying it isn't competitive, just that it's nowhere near as one-sided as you might be led to believe by cherry-picking.

    37. Re:IMHO - No thanks. by PhamNguyen · · Score: 1

      Got any evidence for that claim? here are some benchmarks that suggest gaming performance is the same (which is what you would expect since the OS isn't participating much, except through the graphics drivers).

    38. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      I had a desktop with two graphics cards in sli, and two monitors.. getting them both working in 2006 was a pain

      Guess what, that's actually gotten worse!

      X.org decided it would be clever to throw out xinerama so now 1 video card is all you get and you need to host a separate X session (i.e. have 2 fully independent desktops that you can't move windows between) in order to utilise a monitor plugged into a second card. I love when things which worked in 2000 are suddenly "the technology just doesn't exist yet" in 2010. [This has always worked in Windows since Win2000 BTW, and still works]

      OTOH, SLi doesn't really do anything worth having in Linux. All SLi does is render one 3D frame on card A then render the second one on card B, rinse repeat. You aren't going to be doing any heavy duty gaming in Linux so the performance benefit of SLi gets you nothing beyond increased power usage.

    39. Re:IMHO - No thanks. by gl4ss · · Score: 2

      No doubt your CPU would win. But when looking at power/price as well, you'd have to pit your CPU against 50 or so ARM chips in parallel. For some solutions, it may be a far better choice. One size doesn't fit all.

      50 costs more in silicon than a single x86.

      basically you need a "new generation" of arm chips. but they'll have to compete against a new generation of x86 chips - and remember, x86 chips are priced as they are only because they're the fastest you can buy!

      the thing is, we have been listening to this for years, that in a few years arm will take over everything. yet it hasn't.

      instead of supercomputing, I would foresee the lowest tier of rent-a-webservers to move to arm.. what's a better business than renting a machine that costs 40 bucks total for 5 bucks a month?

      --
      world was created 5 seconds before this post as it is.
    40. Re:IMHO - No thanks. by Bert64 · · Score: 1

      Alpha used to be the fastest you can buy, and it used to be priced high too...
      ARM is doing what x86 did to the highend risc cpus of the 90s.

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    41. Re:IMHO - No thanks. by Teun · · Score: 2

      Always :)

      --
      "The likes of Facebook and WhatsApp are free to those whose privacy is of zero value."
    42. Re:IMHO - No thanks. by BasilBrush · · Score: 3, Interesting

      Why would an ARM chip use 2 Watts?

      - ARM Cortex-A9
      - 1 ops / cycle @ 800 MHz - 2 GHz
      - 0.25 - 1 Watt

      - ARM Cortex-A15
      - 4 ops / cycle @ 1 - 2.5 GHz*
      - 0.35 Watt

    43. Re:IMHO - No thanks. by Teun · · Score: 2

      Maybe you should ask your mom what those Preview and Continue Editing buttons below your fresh comment mean?

      --
      "The likes of Facebook and WhatsApp are free to those whose privacy is of zero value."
    44. Re:IMHO - No thanks. by BasilBrush · · Score: 1

      I for one am happy to see WinTel crumbling at both ends. Windows and X86, each as ugly as the other.

    45. Re:IMHO - No thanks. by BasilBrush · · Score: 1

      Far more games are played on ARM cpus than X86 CPUs these days. Of course the takeover started at the bottom end with Snake, and moved on through Angry Birds etc., it's only a matter of time before ARM takes over the hard core gamers too. It's more a matter of having a platform with big screen and interesting controllers. ARM CPUs are already up to the task of running such systems.

    46. Re:IMHO - No thanks. by cyber-vandal · · Score: 1

      Yeah yeah you had no problems therefore they don't exist. I wish Linux advocates would be more honest about its flaws. I think it's great but it's nowhere near perfect. I swapped a Mint hard drive from another machine into this one and it works flawlessly which Windows most certainly wouldn't, however when I put Ubuntu on that other machine it was a nightmare.

    47. Re:IMHO - No thanks. by BasilBrush · · Score: 3, Informative

      what do you think goes on at the other end of the copper/fibre cable?

      No supercomputing whatsoever. I'm not a physicist, a mathematician, a code breaker nor anyone else with supercomputing needs. My HTTP request for a web page is quite likely served by a single core. Maybe 2.

    48. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      Looks like a shill post to me. Naughty Intel!

    49. Re:IMHO - No thanks. by unixisc · · Score: 2

      Alpha's high price was due to DEC trying too hard to achieve prized speeds, and thereby having plenty of fallout, resulting in their need to jack up prices on those that did pass their tests. Had DEC gone for different speed bins, instead of just one, they could have priced it lower and sold it to markets which would have happily considered an Alpha, but where price was less critical.

    50. Re:IMHO - No thanks. by Rockoon · · Score: 1

      ..and by ugly you mean the greatest (most versatile) addressing modes of any currently produced CPU's?

      The x86 addressing modes are so powerful that they even created an instruction to leverage the addressing generation logic without accessing memory...
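
The instruction alluded to above is presumably LEA (load effective address); the comment does not name it. As a minimal sketch, the address form it evaluates, base + index * scale + displacement, can be modeled as plain arithmetic. The function name and the 64-bit wraparound are illustrative assumptions.

```python
def effective_address(base: int, index: int = 0, scale: int = 1,
                      disp: int = 0, bits: int = 64) -> int:
    """Model of an x86-style effective address: base + index*scale + disp.

    No memory is touched; the result simply wraps to `bits` bits.
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 scale factors are limited to 1, 2, 4 or 8")
    return (base + index * scale + disp) & ((1 << bits) - 1)


# LEA-style use as general arithmetic: x * 5 computed as x + x*4.
x = 7
print(effective_address(base=x, index=x, scale=4))  # 35
```

Compilers often use this form for small multiplications and combined add-and-shift operations, which is the kind of reuse of the address-generation logic the comment describes.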

      The fact is that neither RISC nor CISC is best, that a hybrid of the two is best. The problem with the RISC camp is that they can't make it hybrid while still being RISC, while the CISC camp hybridized long ago and even remained entirely compatible while doing it.

      --
      "His name was James Damore."
    51. Re:IMHO - No thanks. by ceoyoyo · · Score: 1

      Those are generally the problems people run on massively parallel supercomputers.

    52. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      There is another end?

    53. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      Good thing nobody uses cancerous NodeJS at the supercomputer scale. In fact literally nobody uses it at scale at all.

    54. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      You sound like an idiot.

    55. Re:IMHO - No thanks. by gl4ss · · Score: 1

      ..if it runs x86 native, isn't it a x86 cpu?
      you look like an idiot who read some hyped-up article a few years back and is still waiting for it to be true. keep waiting! like for the magic parallel! (plenty of games utilize parallel code nowadays)

      --
      world was created 5 seconds before this post as it is.
    56. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      The trick is finding everyday workloads that benefit from that kind of parallelization, as well as needing that level of upward scaling.

      I guess you haven't been paying attention to virtualization. Typical web server these days is not a physical server, it is a VM running on the same system as 100 other VMs. Same for more and more core business systems. So many advantages. I can run an obsolete app from 1995 on a VM when it won't run on any current physical hardware.

    57. Re:IMHO - No thanks. by AchilleTalon · · Score: 1

      Your comment is off-topic. Nobody cares about your gaming machine and your desktop. Have you read the article? It is about HPC, you know these machines which are simulating global warming, nuclear weapons, etc. It is talking about entire rooms filled with dense compact racks of CPUs and memory and these are having a super high electricity bill to pay each month and they actually care about energy efficiency which may mean more processing power for the same price. Overclocking your gaming machine isn't HPC.

      --
      Achille Talon
      Hop!
    58. Re:IMHO - No thanks. by jedidiah · · Score: 1

      No. Alpha anything was priced insanely.

      There have always been cheap x86. It's only the extreme high end that's been ridiculous. There has always been a sweet spot with x86 in terms of price and performance.

      Although Alpha does provide a nice example of how performance per core trumps anything else. There were some problems you simply could not solve by throwing lesser CPUs at it no matter how much you might have wanted.

      --
      A Pirate and a Puritan look the same on a balance sheet.
    59. Re:IMHO - No thanks. by jedidiah · · Score: 1

      Quiet low profile PCs are ridiculously easy to get. PCs have been shrinking in size for years and they were some of the earliest machines to come in a low profile form factor. Ironically enough, this category includes a lot of machines intended for office use.

      If you can't find a quiet powerful PC you just aren't looking very hard.

      --
      A Pirate and a Puritan look the same on a balance sheet.
    60. Re:IMHO - No thanks. by jedidiah · · Score: 1

      I wish Lemming trolls would be more honest about Windows flaws and how Linux really stacks up against it once you stop trying to pretend that Windows is something that it really isn't.

      --
      A Pirate and a Puritan look the same on a balance sheet.
    61. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      Alternatively, one person with an unusable unsupported PC claims Linux ain't ready. Works both ways, buddy.

    62. Re:IMHO - No thanks. by BasilBrush · · Score: 1

      I mean ugly in exactly the same way as Windows. Inelegant. Crap piled upon shit. Beauty is not made by adding features.

      It's not a RISC vs CISC comment. It's specifically the x86 lineage that I'm referring to as ugly. It was the first processor family that I didn't want to learn assembler for, and for 30 years since then I've continued to avoid it. Though I learned assembler for other architectures in that time.

      6502 was beautiful, as was 6809, 68000, ARM, PowerPC and Propeller. All in their own ways. X86 is ugly.

    63. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      Don't forget Intel's new Haswell line either. I'd imagine that Intel will end up very energy competitive while remaining very computationally efficient, something that I don't see ARM having a hope in hell of matching or even keeping up with.

    64. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      Yes, because when building super computers one must always consider the needs of gamers. You might just as well be decrying the low cargo capacity of rollerskates. I don't think you understand the topic under discussion.

    65. Re:IMHO - No thanks. by hairyfeet · · Score: 1

      Exactly, I really don't get this trying to shoehorn ARM into places where it just don't make any damned sense, its just as stupid as how they tried pushing ActiveX "cloud apps" a few years back for every damned thing and it was retarded then and retarded now.

      Does that mean ARM doesn't have a place? Of course not. I could easily see new hybrid servers that have an ARM chip for when it's not doing much and then hand it off to the X86 when it needs the power, but let's face it, the IPC of ARM is just piss poor, it really is. To get ARM up to the level of an 8 year old Core based Xeon or AMD Opteron causes it to blow its power budget all to shit. To use a /. car analogy, it would be like buying a Kia for the gas mileage and then rebuilding the thing to haul boats: you've just destroyed the reason for buying the Kia in the first place by forcing it to do a job it just isn't good at.

      Let's face it folks, that 6 year old C2Q or Phenom X4 just curbstomps the latest and greatest ARM chips when it comes to IPC and I don't see that changing in the future. You can have low power or high performance, NOT both.

      --
      ACs don't waste your time replying, your posts are never seen by me.
    66. Re:IMHO - No thanks. by K.+S.+Kyosuke · · Score: 1

      When working with NUMA you will have to think about how you move threads between cores or cpus.

      Shouldn't a good OS take care of that for you? Just like the paging mechanism takes care of moving pages between slower and faster storage?

      --
      Ezekiel 23:20
    67. Re:IMHO - No thanks. by K.+S.+Kyosuke · · Score: 1

      Well, that probably has a lot to do with their manufacturing technology rather than with them having a completely unbeatable architecture.

      --
      Ezekiel 23:20
    68. Re:IMHO - No thanks. by socode · · Score: 1

      In some cases, this amounts to using VMs to model poor deployment and versioning processes.

      The obsolete app approach will also run out of road - for businesses that are required to run on e.g. only supported OS versions, the VM approach would only buy an extra 2-3 years (i.e. when you can't buy new hardware which ships with $OS). Since this will apply to a lot of larger enterprises, they will be likely to apply it to their hosting (or even service) providers.

    69. Re:IMHO - No thanks. by olip85 · · Score: 1

      One size doesn't fit all.

      You've never been in the army I suppose? ;-)

    70. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      > us chock full of errors

    71. Re:IMHO - No thanks. by colinrichardday · · Score: 1

      Well that would be one way to save power. :-)

    72. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      Oh give me a break, what a fucking liar. Go get your facebook/twitter/dopamine fix. I haven't seen any distributions fail like you said since the 90s you a troll and a fucking liar.

    73. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      Linux is close to 20 years old..

      And the BSD core in MacOS is close to 40 years old.

      Android would make a lousy desktop interface, just like Windows 8. It was designed for phones and is barely a usable tablet interface. Of course, it probably is more usable than Gnome 3.

      The MacOS X kernel is based on Mach which was started in 1985 at Carnegie Mellon. Yes, you did say "BSD core" but Linux is a kernel. So compare apples to apples (pun intended) please.

    74. Re:IMHO - No thanks. by EyeSavant · · Score: 1

      Well scaling is always a problem, but not at the 50 CPU level. In the basement at work we have a machine with 30,000 cores and running on that many is definitely a scaling problem.

      Cores are not getting any faster though (due to power footprint mainly), so scaling is a problem that is going to come to everyone sooner or later.

    75. Re:IMHO - No thanks. by LordVader717 · · Score: 1

      Supercomputers are meant to be scalable, so it doesn't really matter what the computing power of an arbitrary "CPU" is (which consists of an equally arbitrary number of cores all running at a somewhat arbitrary clock rate).
      What does however influence design decisions are
      1) cost
      2) power consumption
      If an ARM machine has an advantage on these compared to an equally powerful x86 cluster then that makes it better.

    76. Re:IMHO - No thanks. by Dcnjoe60 · · Score: 1

      Well scaling is always a problem, but not at the 50 CPU level. In the basement at work we have a machine with 30,000 cores and running on that many is definitely a scaling problem.

      Cores are not getting any faster though (due to power footprint mainly), so scaling is a problem that is going to come to everyone sooner or later.

      I agree, I was referring to 50 cores being a problem, which is what the OP was posting about. At 30,000 cores, scaling will be a problem regardless of whether it is based on ARM or not!

    77. Re:IMHO - No thanks. by LordVader717 · · Score: 1

      Just like to point out that the Nintendo 3DS and Playstation Vita, two "hardcore" systems, use ARM CPUs.

    78. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      Nobody cares as long the performance is there.

    79. Re:IMHO - No thanks. by White+Flame · · Score: 1

      50 costs more in silicon than a single x86.

      Sure in the one-time silicon expense, but not in TCO with power & cooling accounted for, over thousands of racks over years.

      x86 chips are priced as they are only because they're fastest you can buy!

      It's not about speed. It's about operations per watt. (Usually FLOPS/W, but not necessarily.) The competition is quite close in this regard.
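
      As a rough illustration of the operations-per-watt and TCO argument above, the sketch below adds purchase price to energy cost (with a cooling overhead) for a group of identical chips. Every name and parameter here is an assumption made up for illustration; none of it comes from the linked analysis or from measurements of real hardware.

```python
HOURS_PER_YEAR = 365 * 24


def flops_per_watt(gflops, watts):
    """Efficiency metric: GFLOPS delivered per watt drawn."""
    return gflops / watts


def total_cost(unit_price, units, watts_per_unit, years,
               price_per_kwh, cooling_overhead=0.5):
    """Purchase price plus electricity over the service life.

    cooling_overhead is the extra energy spent on cooling, as a
    fraction of the energy drawn by the chips (hypothetical default).
    """
    energy_kwh = units * watts_per_unit * HOURS_PER_YEAR * years / 1000.0
    energy_cost = energy_kwh * price_per_kwh * (1.0 + cooling_overhead)
    return unit_price * units + energy_cost
```

      Calling total_cost once for the many-small-chips option and once for the single-big-chip option, at equal aggregate throughput, shows how the one-time silicon expense can be outweighed by the power bill over several years, or not, depending entirely on the numbers plugged in.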

    80. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      3 16:9 monitors in portrait mode is your solution. 1080 is quite a nice width if you work with text a lot. You will need more than one if you go this route though, 2 will do but 3 is nicer because you'll want to put tool windows on one and the editor on the other.

    81. Re:IMHO - No thanks. by crutchy · · Score: 1

      lucky i didn't say supercomputing then or else i'd have really looked like a fool

    82. Re:IMHO - No thanks. by crutchy · · Score: 1

      it's full of tubes

    83. Re:IMHO - No thanks. by BasilBrush · · Score: 1

      The story is about supercomputing. Parallel computing, so what?

      You don't need to go as far as the end of the copper/fibre. As I said it's quite likely a HTTP request is serviced by a single core. So it's quite likely your mobile phone is using more cores than the other end of the connection.

    84. Re:IMHO - No thanks. by dkf · · Score: 1

      lucky i didn't say supercomputing then or else i'd have really looked like a fool

      Yes, but it was still precisely irrelevant to an article on supercomputing. The requirements there are really quite different to both home/office computing and also to general servers. For one thing, supercomputing is far more likely to be CPU-bound or interconnect-bound than the other cases I've just listed, which are virtually always either external-IO-bound or (if you're unlucky) memory-bound. This gives rise to a very different set of challenges: in particular, heat management is a massive issue because supercomputers are always packed as tightly together as possible to minimize interconnect communication delays. (Damn you, speed of light!)

      --
      "Little does he know, but there is no 'I' in 'Idiot'!"
    85. Re:IMHO - No thanks. by dkf · · Score: 1

      You say that until the power grid fails and your generator fails to kick on, leaving you with only battery backup in place.

      That's why you have a plan of regular testing for your generator. You do have one of those and follow it, yes? (And remember to keep it fueled too, lest you be visited by the facepalm moment of clarity...)

      --
      "Little does he know, but there is no 'I' in 'Idiot'!"
    86. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      ARM Cortex-B53

      27 Watt

    87. Re:IMHO - No thanks. by Anonymous Coward · · Score: 0

      Yup, I'm still running a 6502... one of these days the next gen will be good enough for me!

      captcha: postpone!

    88. Re:IMHO - No thanks. by TheSkepticalOptimist · · Score: 1

      How can you claim this is insightful?

      This is talking about server products, not your beloved I7 PC.

      Supercomputers are not about the performance of a single CPU, but the aggregate performance of a slew of CPUs. And considering the desire to reduce the power consumption of Data Centers, any time you can get real gains in power reduction means real gains in profit.

      Not only that but it's pretty clear the average PC user was happy to switch from mainstream Intel to ARM based tablets and phones. I agree it's a step back in the evolution of computers, but when a billion people decided that a tablet is "good enough" for them, the market listens. Even Intel is struggling now to make a CPU with the performance/power ratio of ARM rather than worrying about kick-ass home CPUs for the few million PC DIY'ers left in the world.

      --
      I haven't thought of anything clever to put here, but then again most of you haven't either.
    89. Re:IMHO - No thanks. by ChrisMaple · · Score: 1

      WBZ in Boston had a regular program of testing their backup generator. When the Great Blackout of 1965 came along, the generator started up automatically and ran just as planned -- for about a minute. They'd used up all their fuel in test runs.

      --
      Contribute to civilization: ari.aynrand.org/donate
    90. Re:IMHO - No thanks. by crutchy · · Score: 1

      i guess if you look at it like that then each computation performed on a supercomputer is only using a single core too

    91. Re:IMHO - No thanks. by crutchy · · Score: 1

      my comment was relevant to the post i was replying to, which was trying to say that home computing has nothing to do with parallel workloads, which is wrong

      and every application is IO-bound, including on supercomputers... maybe if you're just crunching the value of pi to the millionth decimal place you may not need IO, but if you're trying to do anything useful (like crunching data from SETI or the human genome project or trying to predict the weather), you're going to need real data. that's why things like SANs were created.

      maybe ask google how parallel their search engine operation is

    92. Re:IMHO - No thanks. by tolkienfan · · Score: 1

      OSes have algorithms that perform well, or at least reasonably, in the general purpose case.
      Optimizing for a specific case can save a lot of cycles. When those cycles are multiplied, as they often are in HPC, it can result in a huge performance gain.
      This goes for networking, concurrency, memory allocation, and NUMA accesses.
      Actually, current OSes have next to nothing to base NUMA decisions on, so they do a horrible job.

    93. Re:IMHO - No thanks. by tolkienfan · · Score: 2

      A single Sandy Bridge system will outperform many dozen Raspberry PIs.

    94. Re:IMHO - No thanks. by cyber-vandal · · Score: 1

      The only time I mentioned Windows in that post was to criticise it.

    95. Re:IMHO - No thanks. by tzot · · Score: 1

      > The x86 addressing modes are so powerful that they even created an instruction to leverage the addressing generation logic without accessing memory...

      I don't know which instruction they created, but it sounds dangerously close to the LEA instruction that existed on the 68008 my Sinclair QL was using, and it was neither Motorola nor Intel that originally came up with this idea.

      BTW: 68k assembly was a pleasure to write.

      --
      I speak England very best
    96. Re:IMHO - No thanks. by BasilBrush · · Score: 1

      Only if you have a very personal definition of "computation" that defines it as being something that can only be done on one core.

      If you'd said operation, then I'd have agreed with you.

      None of which is relevant to the topic of supercomputing being a very different thing from client computers (or servers) running a conventional OS with a handful of cores.

    97. Re:IMHO - No thanks. by BasilBrush · · Score: 1

      my comment was relevant to the post i was replying to, which was trying to say that home computing has nothing to do with parallel workloads, which is wrong

      No you didn't reply to any such post. The post you originally replied to was trying to make the distinction that PCs are not supercomputers. You thought the fact that there's some degree of parallel processing (at the server of all places) meant that it was supercomputing. Which means you didn't understand supercomputing, and it seems you still don't.

      and every application is IO-bound, including on supercomputers... maybe if you're just crunching the value of pi to the millionth decimal place you may not need IO, but if you're trying to do anything useful (like crunching data from SETI or the human genome project or trying to predict the weather), you're going to need real data. that's why things like SANs were created.

      No, supercomputers are usually not IO bound, but bound by the interconnect speed between the processors. And it's very likely that they are not doing much if any IO whilst operating. They are quite likely to hold all the data locally.

      You mention SETI, which is more distributed computing than supercomputing, and uses slow internet connections between devices, yet it spends far more time analysing the data packets than pulling them off the network.

    98. Re:IMHO - No thanks. by K.+S.+Kyosuke · · Score: 1

      Actually, current OSes have next to nothing to base NUMA decisions on, so they do a horrible job.

      But I thought that NUMA is non-uniform with regards to the physical addresses, i.e., the OS should be able to do something about it. Couldn't an OS transparently move a piece of data from one physical page to another (nearer to the process that frequently needs it) and then just transparently update the page mapping data, without any user process noticing? Also, one idea that comes to my mind is a simple syscall that would allow you "prefetch" the data closer, if you have multiple nodes working on a shared large data structure (a large matrix, perhaps?). Basically, it would allow you to say: "Dear kernel, do you see this 1MB of data beginning at 0x5dcbb00000? I'll be working with it extensively 20ms from now, could you please move it a bit closer to my core while I'm doing something else?" I'm not a HW or HPC guy, but it seems to me a bit weird that this should be a big problem for system designers.

      --
      Ezekiel 23:20
    99. Re:IMHO - No thanks. by tolkienfan · · Score: 1

      Linux does have a function to allow pages to be moved on demand, but clearly that's up to the application, and no longer an automated OS facility, and my point still stands.

      Moving pages isn't very cheap, so doing so too often will do much more harm than good.

      But the worst part is many (most?) pages are shared, either among threads or processes. The best performance comes from specializing, i.e. not relying on the OS.
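
      The trade-off described above (moving a page is not cheap, so it only helps if the page is then used heavily from its new node) can be put into a back-of-the-envelope model. The latencies and the migration cost are hypothetical parameters supplied by the caller, not figures for any particular machine or kernel.

```python
def migration_pays_off(accesses, remote_ns, local_ns, migrate_ns):
    """True if moving the page saves more time than the move costs.

    accesses   -- expected number of accesses after the move
    remote_ns  -- latency of one access to the page where it is now
    local_ns   -- latency of one access once the page is node-local
    migrate_ns -- one-time cost of moving the page
    """
    saved_ns = accesses * (remote_ns - local_ns)
    return saved_ns > migrate_ns


def break_even_accesses(remote_ns, local_ns, migrate_ns):
    """Smallest whole number of accesses for which the move is a net win."""
    penalty_ns = remote_ns - local_ns
    if penalty_ns <= 0:
        raise ValueError("remote access must be slower than local access")
    return migrate_ns // penalty_ns + 1
```

      The model also hints at why shared pages are the hard case: a page that is hot on two nodes at once has no placement that makes every access local, so the savings term shrinks while the migration cost stays the same.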

    100. Re:IMHO - No thanks. by LordLimecat · · Score: 1

      So if the i7 is more energy efficient per-workload, that makes it even more one-sided.

      As I recall, the fight gets even more brutal when doing things like AES or anything optimized with an instruction set, or anything bound by memory bandwidth.

      http://www.phoronix.com/scan.php?page=article&item=samsung_exynos5_dual&num=6
      That's the first-gen Core series vs an A15. Note that it completes its work in 1/3 the time. Ivy Bridge is ~30% faster for the same parts, and uses less power (~1/2), and that's not a particularly optimal processor either. I'm not sure if anyone has ever compared one of the low-power i7s or Xeons (i.e., E3 1220Lv2) to ARM, but I imagine it wouldn't be pretty.

    101. Re:IMHO - No thanks. by Redmancometh · · Score: 1

      I'm egocentric, so I assume my posts are perfect without looking them over.

    102. Re:IMHO - No thanks. by crutchy · · Score: 1

      The post you originally replied to was trying to make the distinction that PCs are not supercomputers. You thought the fact that there's some degree of parallel processing (at the server of all places) meant that it was supercomputing. Which means you didn't understand supercomputing, and it seems you still don't.

      from the original post i was replying to...

       

      "PC users" aren't even mentioned in this article. This article is about supercomputers where the workloads are by virtual definition extremely parallel

      my reply was to highlight that PC users rely on parallel computing (not on their PC but at the other end)

      i know you can't help being a dipshit, but you don't have to make it so obvious

      They are quite likely to hold all the data locally

      hahahahahahaha!!!!!

      so your supercomputer cpu holds all the data inside the cpu... what a fucking moron

      and SETI is supercomputing... the only difference between BOINC and Blue Gene is that the wires are longer

      i pity your level of ignorance

    103. Re:IMHO - No thanks. by crutchy · · Score: 1

      yeah like i said... many people (including you apparently) have no idea what goes on at the other end

    104. Re:IMHO - No thanks. by crutchy · · Score: 1

      so are you still trying to claim that my mobile phone has more computational power than google's search infrastructure?

      surely you aren't as stupid as you sound

    105. Re:IMHO - No thanks. by BasilBrush · · Score: 1

      You've already displayed your lack of knowledge of computing. You're now showing your lack of maturity.

    106. Re:IMHO - No thanks. by crutchy · · Score: 1

      sore loser

  2. Slashvertisement much by Anonymous Coward · · Score: 0

    Really, Soulskill?

    1. Re:Slashvertisement much by crutchy · · Score: 1

      so you don't like a nerdy FA linked on a site that markets itself as 'news for nerds'?

      maybe if you don't like any form of online activity that could possibly be construed as advertising you're better off disconnecting your internet altogether

      after all, you used the word 'really' in your post, which contains 'real', so obviously you're astroturfing for real networks... shill much?

  3. Early supercomputers by Anonymous Coward · · Score: 0

    Like the CDC6600?

  4. Not buying it. by Anonymous Coward · · Score: 1

    Power/performance ratios are with x86.

    1. Re:Not buying it. by symbolset · · Score: 0

      This is easy to say but all the top supercomputers are GPGPU based now. The CPU is a management appliance that dishes the computables to the compute cores.

      --
      Help stamp out iliturcy.
    2. Re:Not buying it. by dbIII · · Score: 1

      It depends entirely on the task. There's plenty of threads that just cannot fit their memory requirements onto a GPU and keeping the things fed with memory can be slower than doing it on a CPU in the first place. Remember you are comparing something of the order of 8GB shared memory between the GPU cores with 1TB shared between the CPU cores.

    3. Re:Not buying it. by MikeBabcock · · Score: 3, Informative

      I don't buy your response: http://top500.org/statistics/list/ ... click accelerator and hit submit.

      87.6% of the top 500 super computers have no NVIDIA etc. coprocessing

      --
      - Michael T. Babcock (Yes, I blog)
    4. Re:Not buying it. by symbolset · · Score: 1

      I'm thinking you don't understand. The whole "shared memory" thing is not exclusive to x86 cores. At some level it's a software abstraction relating to latency of storage. GPUs can have terabytes of RAM too as a sixth level cache.

      Intel really needs some help here because the ground has shifted too much for them.

      --
      Help stamp out iliturcy.
    5. Re:Not buying it. by symbolset · · Score: 1

      OK, fine. Pretend this isn't happening and see how that works out for you.

      --
      Help stamp out iliturcy.
    6. Re:Not buying it. by gronnsak · · Score: 1

      Sure, but the supercomputers using accelerators represent 23% of list performance. In addition, the usage of accelerators tripled from June 2011 to November 2012, so there seems to be a trend here.

    7. Re:Not buying it. by Anonymous Coward · · Score: 0

      Except for the Xeon Phi based ones.

    8. Re:Not buying it. by Anonymous Coward · · Score: 0

      *laugh* Yeah. No GPU manufacturer makes enough money off of HPC to do development. it's all subsidized by the gaming GPUs, and with all of them moving to mobile, it's not clear how this is going to continue in the future. Besides, very few HPC applications (relative to the number being run) have been re-written for GPUs. Intel's Phi is going to eat them alive simply because it is so much easier to port MPI/OpenMP code.

    9. Re:Not buying it. by Kaldaien · · Score: 1

      Yeah, it's definitely not the norm at the moment. However, GPGPU is gaining traction very fast.

      It really comes down to the application, algorithms still have to be re-tooled to function optimally on stream processor architectures, and the languages that have cropped up around the new hardware (e.g. CUDA / OpenCL) introduce challenges to legacy software as well. Some applications will never benefit from billions of simple simultaneous threads, as much as they would fewer, more capable hardware threads.

      We'll probably see a mix of all three architectures in the coming years. You have to pick the right tool for the job, and the jobs are just as diverse as the hardware we're discussing.

  5. Does it really matter? by gman003 · · Score: 4, Interesting

    Most of the actual processing power in current supercomputers comes from GPUs, not CPUs. There are exceptions (that all-SPARC Japanese one, or a few Cell-based ones), but they're just that, exceptions.

    So sure, replace the Xeons and Opterons with Cortex-A15s. Doesn't really change much.

    What might be interesting is a GPU-heavy SoC - some light CPU cores on the die of a supercomputer-class GPU. I have heard Nvidia is working on such (using Tegra CPUs and Tesla GPUs), and I would not be surprised if AMD is as well, although they'd be using one of their x86 cores for it (probably Bulldozer - damn thing was practically built for heavily-virtualized servers, not much different from supercomputers).

    1. Re:Does it really matter? by Victor+Liu · · Score: 5, Informative

      As someone who does heavy duty scientific computing, I wouldn't say that "most" of the actual processing power is in GPUs. They are certainly more powerful at certain tasks, but most applications being run are legacy code, and most algorithms require substantial reworking to get them to run with reasonable performance on a GPU. Simply put, GPU for supercomputing is not quite a mature technology yet. I am personally not too interested in coding for GPUs simply because the code is not portable enough yet, and by the time the technology might be mature, there might be a new wave of technology (like ARM) that could be easier to work with.

    2. Re:Does it really matter? by Anonymous Coward · · Score: 2, Insightful

      False. According to the Top 500 computer survey from November, 2012 (Category: Accelerator/Co-Processor), 87% of systems are not using any type of GPU co-processor, and 77% of the processing power is coming from the CPU.

      This is, however, a decrease from the June 2012 survey, so GPU is certainly making inroads, but it is not yet the main source of computation.

      http://www.top500.org/statistics/list/

      I still remember when the IBM Blue architecture came out, using embedded PowerPC processors and it was a huge power savings. It was a big deal, but far from a complete solution (limitations in RAM with no disk/swap).

      There is certainly a growing demand for a better power/performance solution in order to reduce total cost of operation. The individual performance of each processor doesn't matter as much when you have applications which are written to take advantage of 100,000s of processors in parallel.

    3. Re:Does it really matter? by Anonymous Coward · · Score: 0

      On the most recent (November 2012) Top500 list, there are only 15 clusters in the top 100 using GPUs for compute.

    4. Re:Does it really matter? by Junta · · Score: 5, Informative

      Of the last published top500 list, 7 out of the top 10 had no GPUs. This is a clear indication that while GPU is definitely there, claiming 'Most of the actual processing power' is overstating it a touch. It's particularly telling that there are so few, as overwhelming the specific hpl benchmark is one of the key benefits of GPUs. Other benchmarks in more well rounded test suites don't treat GPUs so kindly.

      --
      XML is like violence. If it doesn't solve the problem, use more.
    5. Re:Does it really matter? by symbolset · · Score: 5, Interesting

      These ARM cores are halfway between the extremely limited GPU cores and the extremely flexible X86 cores. They may be the "happy medium".

      --
      Help stamp out iliturcy.
    6. Re:Does it really matter? by KiloByte · · Score: 5, Informative

      Also, a lot of algorithms, perhaps even most, rely on branching, which is something GPUs suck at. And only some can be reasonably rewritten in a branchless way.

      --
      The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
    7. Re:Does it really matter? by XaXXon · · Score: 1

      It really doesn't seem like portability should be a huge goal for writing code for top-100 supercomputers. The cost of the computer would dwarf (or at least be a significant portion of) the cost of developing the software for it. It seems like writing purpose-built software for this type of machine would be desirable.

      If you can cut the cost of the computer in half by doubling the speed of the software, it seems a valid fiscal tradeoff, and the way to do that would be to write it for purpose-built hardware.

    8. Re:Does it really matter? by Victor+Liu · · Score: 2

      On the point of portability, there's then a distinction of your focus. If you do research on numerical methods, then yes, you would write highly optimized code for a particular machine, as an end in and of itself. I myself am merely a user, and our research group does not have the expertise to write such optimized code. We pay for time on supercomputing clusters, which constantly bring online new machines and retire old ones. Every year our subscription can change, and we are allowed to use resources on different computers. Therefore, from my standpoint, portability is very important. Otherwise, if we were to write our own code in-house, we basically have a 1 year (ok, fine, maybe 2 or 3 year) window in which to develop, test, and run it. It just doesn't seem worthwhile to spend so much effort developing a one-time use piece of code. I'd rather write something which will outlive my stay in the research program.

    9. Re:Does it really matter? by azi · · Score: 1

      It really doesn't seem like portability should be a huge goal for writing code for top-100 supercomputers.

      It depends on what you are doing. If you have a relatively short term project (say less than a couple of years) you are right. But if you do some serious long term research, portability has a huge impact, especially if your project has a relatively large code base. Thing is, architectures come and go and you can't count on what architecture you are going to use five years from now. Another thing is that you might use a wide variety of computing sites, in which case portability is essential, really.

      --

      bash: sig: command not found

    10. Re:Does it really matter? by JanneM · · Score: 2

      System and numerical libraries and compilers are of course written specifically for the machine. But user-level apps (and a lot of scientific computing uses finished apps) are ported across multiple systems.

      Portability is not as big an issue as it was a generation ago, as most supercomputers basically are Linux machines today, and made to more or less look like a typical Linux installation from a user-application level, with a POSIX API; pthreads, OpenMP and OpenMPI; a standard set of numerical libraries; and often even gcc-compatibility in order to minimize the effort of porting. A notable exception is GPU-based machines (that are in the minority today, despite the OP assertion); they don't have a common API to write for, so using them is substantially harder at a user-level.

      And at a user level (but unlike system libs) porting or coding time very much matters. Let's say your project is going to need a month of wall-clock computing time during the course of a year or two. If switching to a GPU-based system would shrink that by 50% - two weeks - then the effort to move your model code, app, and libraries had better take less than two weeks of work or you're going to waste project time, not save it.

      --
      Trust the Computer. The Computer is your friend.
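
      The break-even argument in the comment above (porting effort has to cost less time than the faster machine saves) can be written down in a few lines. The function names and the simplification that porting time and compute time are directly comparable days are assumptions for illustration only.

```python
def days_saved(compute_days, speedup):
    """Wall-clock days saved if the ported code runs `speedup` times faster."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return compute_days * (1.0 - 1.0 / speedup)


def porting_is_worth_it(compute_days, speedup, porting_days):
    """True only if the porting effort is strictly smaller than the saving."""
    return porting_days < days_saved(compute_days, speedup)
```

      With four weeks of compute and a 2x speedup (the "shrink that by 50%" case), days_saved(28, 2) is 14 days, so any port that takes two weeks or longer loses project time rather than saving it.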
    11. Re:Does it really matter? by MichaelSmith · · Score: 1

      It's the same for ARM. Java doesn't run properly yet because of the floating point limitations of ARM.

    12. Re:Does it really matter? by ThePeices · · Score: 5, Funny

      Also, a lot of algorithms, perhaps even most, rely on branching, which is something GPUs suck at. And only some can be reasonably rewritten in a branchless way.

      Nonsense, I play Farcry3 on my GPU, and it renders branches just fine thank you very much.

    13. Re:Does it really matter? by Anonymous Coward · · Score: 0

      > there might be a new wave of technology (like ARM) that could be easier to work with.

      Really? You went with "ARM" and not Xeon Phi?

    14. Re:Does it really matter? by Zo0ok · · Score: 1

      Isn't the ironic thing here, that ARM is also not very good at branching? No branch prediction - that at least used to be the case.

    15. Re:Does it really matter? by serviscope_minor · · Score: 1

      These ARM cores are halfway between the extremely limited GPU cores and the extremely flexible X86 cores. They may be the "happy medium".

      Not at all. They are much more like slow x86 processors. They can branch just as well, but are much slower and don't have a narrow very high performance sweet spot like GPUs.

      I somewhat expect AMD's new unreleased APUs to be the happy medium. Not as much grunt or memory bandwidth as a discrete GPU, but still some stream processors and much easier to program.

      --
      SJW n. One who posts facts.
    16. Re:Does it really matter? by AmiMoJo · · Score: 1

      Not really. The main difference between ARM and x86 cores in this application is that ARM has an equally flexible but lower performance ALU. For scientific applications that is a good trade off because performance tends to be mostly dependent on the FPU and on things like network and memory latency.

      In other words it is hard to max out an x86 core constantly in a supercomputer so much of its performance is unused. ARM does away with the bits that are less critical which results in lower power consumption and price, so you can either take the savings or have more of them.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    17. Re:Does it really matter? by Anonymous Coward · · Score: 0

      Then again, only 62 systems of the 500 in the latest list are using accelerators. The top 10 are using more accelerators than the rest on average.

    18. Re:Does it really matter? by Anonymous Coward · · Score: 0

      That's an implementation issue, not an ISA one.

    19. Re:Does it really matter? by tepples · · Score: 1

      ARM has predication: execute or don't execute a particular instruction based on the result of a previous instruction. It's like branching past one instruction at a time, and it doesn't stall the pipeline.
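
      To make the idea of predication concrete, here is a toy model in Python, not real ARM code. Each instruction carries a condition; the interpreter steps through every instruction in order and simply skips the effect of those whose condition is false, so the instruction stream itself never jumps. The instruction tuples, register names and the two condition codes are simplified assumptions loosely modelled on CMP / MOVGT / MOVLE.

```python
def run(program, regs):
    """Execute a straight-line program of (op, cond, dst, a, b) tuples."""
    flags = {"gt": False, "le": False}
    for op, cond, dst, a, b in program:
        # "al" means always; any other condition is checked against the flags.
        if cond != "al" and not flags[cond]:
            continue  # instruction has no effect, but nothing is jumped over
        if op == "cmp":
            flags["gt"] = regs[a] > regs[b]
            flags["le"] = not flags["gt"]
        elif op == "mov":
            regs[dst] = regs[a]
    return regs


# r0 = max(r1, r2), roughly: CMP r1, r2 ; MOVGT r0, r1 ; MOVLE r0, r2
MAX_PROGRAM = [
    ("cmp", "al", None, "r1", "r2"),
    ("mov", "gt", "r0", "r1", None),
    ("mov", "le", "r0", "r2", None),
]


def predicated_max(x, y):
    regs = run(MAX_PROGRAM, {"r0": 0, "r1": x, "r2": y})
    return regs["r0"]
```

      All three instructions are always fetched, whichever operand is larger; only the one whose condition holds writes r0. That is the sense in which predication is like branching past one instruction at a time.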

    20. Re:Does it really matter? by Rockoon · · Score: 1

      It depends on what are you doing. If you have relatively short term project (say less than couple of years) you are right.

      I've got to take issue with this statement. Anything that takes over a couple of years probably should not be started on new silicon, as it doesn't make sense to start it yet due to Moore's law. The guy that starts the same project a year from now using the same amount of money that you used will beat you to the final calculation and get the hookers and blow that you thought that you deserved.

      The only time it makes sense is when the hardware is otherwise at end of life, that there is no longer an initial investment to get started, and there is unlikely to be anyone else willing to tackle the project anytime soon.

      --
      "His name was James Damore."
    21. Re:Does it really matter? by Rockoon · · Score: 1

      Perhaps. Pretty much any time I am doing some SSE coding I am thinking to myself "wouldn't it be nice if these registers were wider... why doesn't someone in the x86 market just go ahead and make huge vector registers, at least for addition, multiplication, and shifting" and then I realize that that is in fact where the APUs are at right now... and think to myself "geez, I should be doing OpenCL, not this hand-crafted SSE shit"

      --
      "His name was James Damore."
    22. Re:Does it really matter? by Anonymous Coward · · Score: 0

      I guess you work for mil / aero?

      I know this because
      [a] after careful review, your suggestion is that we never start this project
      [b] it ignores the time to develop the solution (which could already be > compute)
      [c] presumably in the mean-time you/team should be paid to do nothing
      [d] you think this would be accepted by management with an externally imposed deadline

    23. Re:Does it really matter? by excelsior_gr · · Score: 1

      Spot on. There are some software vendors that offer GPU-parallel codes, but GPU-clusters are still too exotic for a non-IT company to own. A simple server box can get you 64 cores with 4 GB RAM per core and that is all you need for fluid dynamic problems with up to, say, 15 million nodes or so, which is pretty decent for an engineering application. Fluent and CFX do not support GPU-processing. Other packages do, but then we would have to buy a new cluster to run only these new applications. A colleague hinted that software vendors may be going the "GPU-way" just to force companies that are not yet ready to embrace the technology to outsource their simulations to them. I think she had a point.

    24. Re:Does it really matter? by dkf · · Score: 1

      I've got to take issue with this statement. Anything that takes over a couple of years probably should not be started on new silicon, as it doesn't make sense to start it yet due to Moore's law. The guy that starts the same project a year from now using the same amount of money that you used will beat you to the final calculation and get the hookers and blow that you thought that you deserved.

      He's not talking about the time to do one run of the calculation, but rather the time to develop the code and the time to let it be used by the target userbase for a while doing various things while supporting them. In that situation, there's no reason to not get started and to write portable code: you just transfer to new hardware when it becomes available and pick up all the Moore's Law improvements when you upgrade the kit.

      --
      "Little does he know, but there is no 'I' in 'Idiot'!"
  6. slow clap is in order by decora · · Score: 0

    google 'slow clap compilation youtube'

  7. Shows how dominant Intel have become by Anonymous Coward · · Score: 0

    Shows how dominant Intel have become that they were actually able to keep competing RISC processors out of many supercomputers for so long.

  8. Exactly. by Junta · · Score: 1, Interesting

    This isn't to say that ARM *can't* be there, but thus far all of the implementations have focused around 'good enough' performance within a tightly constrained power envelope. Intel's designs have traditionally been highly inefficient in that power band, but at peak conditions, it is still compelling.

    I recall one 'study' which claimed to demonstrate ARM as inarguably better. It got way more attention than it should have. The reason is that they measured the actual power draw on the ARM test, but just *assumed* TDP would be the accurate number for x86. There are very few workloads that would cause a processor to *average* TDP over the course of a benchmark.

    The thing that really *is* stealing x86 thunder is the GPU world. Intel's Phi strives to answer it, but thus far falls short in performance. There continue to be areas where GPU architecture is an ill fit, and ultimately I think Phi may end up being a pretty good solution.

    --
    XML is like violence. If it doesn't solve the problem, use more.
  9. Questions... by storkus · · Score: 5, Interesting

    As I understand it, Intel still has the advantage in the performance per watt category for general processing and GPUs have better performance per watt IF you can optimize for that specific environment--both things which have been commented to death endlessly by people far more knowledgeable than I.

    However, to me there are at least 3 questions unanswered:

    1. ASICs (and possibly FPGAs): Bitcoin miners and DES breakers are the best known examples. Where is the dividing line between where your operations are specific enough to employ an ASIC vs not specific enough and needing a GPU (or even CPU)? Could further optimization move this line more toward the ASIC?

    2. Huge dies: This has been talked about before, but it seems that, for applications that are embarrassingly parallel, this is clearly where the next revolution will be, with hundreds of cores (at least, and of whatever kind of "core" you want). So when will this stop being vaporware?

    3. But what do we do about all the NON-parallel jobs? If you can't apply an ASIC and you can't break it down, you're still stuck at the basic wall we've been at for around a decade now: where's Moore's (performance) law here? It would seem the only hope is new algorithms: TRUE computer science!

    1. Re:Questions... by XaXXon · · Score: 0

      Are you sure you know what moore's law is?

      http://en.wikipedia.org/wiki/Moore's_law .. might be worth a read.

    2. Re:Questions... by XaXXon · · Score: 2

      The reason for the question is that nothing in Moore's law says anything about single-threaded performance doubling every 1.5 years, as many think.

      Moore's law is the observation that, over the history of computing hardware, the number of transistors on integrated circuits doubles approximately every two years.
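
      As a rough sketch of that doubling rule (the starting count below is a made-up illustrative number, not a real chip, and the function name is only for this example):

```python
def transistor_count(initial_count: float, years: float,
                     doubling_period_years: float = 2.0) -> float:
    """Projected transistor count after `years`, assuming the count
    doubles once every `doubling_period_years` (Moore's observation)."""
    return initial_count * 2 ** (years / doubling_period_years)


if __name__ == "__main__":
    start = 1_000_000  # hypothetical starting transistor count
    for years in (0, 2, 4, 10):
        print(f"after {years:2d} years: {transistor_count(start, years):,.0f}")
```

      Note that nothing in the formula mentions clock speed or single-threaded performance; it only counts transistors.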

    3. Re:Questions... by Anonymous Coward · · Score: 1

      Given that he felt the need to specify (performance) in parentheses, it appears that he does know what Moore's law is and that he was referring to the impact Moore's law has traditionally had on single-threaded performance.

    4. Re:Questions... by AmiMoJo · · Score: 1

      In ASICs ARM is an ideal choice because you can build it right into the chip from a reference design. A lot of ASICs feature an 8502 core for management and I/O tasks, but if you needed to execute a more complex application then a simple ARM core running THUMB or even a full 32-bit ARM core would be ideal.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    5. Re:Questions... by tepples · · Score: 1

      A lot of ASICs feature an 8502 core for management and I/O tasks

      I thought only a Commodore 128-on-a-chip would have an 8502 core. What am I missing?

    6. Re:Questions... by Goaway · · Score: 1

      Did you see where he put "(performance)" in there?

    7. Re:Questions... by PDF · · Score: 1

      I think GP meant the Intel 8052 core.

  10. So, when can I buy an ARM ATX board? by LaughingRadish · · Score: 2

    Hopefully this means we should start seeing ARM-using motherboards in an ATX form-factor. The Pi and Beaglebone are nice, but I want something that's essentially just like a commodity x86 motherboard except it uses ARM.

    1. Re:So, when can I buy an ARM ATX board? by c0lo · · Score: 1

      Hopefully this means we should start seeing ARM-using motherboards in an ATX form-factor. The Pi and Beaglebone are nice, but I want something that's essentially just like a commodity x86 motherboard except it uses ARM.

      Why? Mini-ATX's not good for a commodity MB? 'cause you don't need a high google-fu to find heaps of them.

      --
      Questions raise, answers kill. Raise questions to stay alive.
    2. Re:So, when can I buy an ARM ATX board? by LaughingRadish · · Score: 2

      Mini-ATX or Mini-ITX will do fine. I just haven't seen any that have the kinds of things you take for granted on x86 boards. I want an ARM board with SATA ports, PCIe slots, and DIMM (or SODIMM) slots. Is that too hard to produce? I don't see anything like this anywhere.

    3. Re:So, when can I buy an ARM ATX board? by 0123456 · · Score: 1

      Ditto. I went looking for an ARM board last time I built a home server, but found nothing that could compete in the slightest against a $90 Atom board.

    4. Re:So, when can I buy an ARM ATX board? by c0lo · · Score: 1

      Slowly, they start to appear.

      --
      Questions raise, answers kill. Raise questions to stay alive.
    5. Re:So, when can I buy an ARM ATX board? by stinerman · · Score: 1

      I've been looking for the same, and never came across those. Thanks.

      Now I have another quibble. ARM CPUs are always soldered on to the board. They can't be upgraded w/o upgrading the entire board. I'm waiting for the day when you can build from scratch a rig using an ARM CPU just like you would with an x86 CPU, using commodity parts from NewEgg, etc.

    6. Re:So, when can I buy an ARM ATX board? by dfghjk · · Score: 1

      The third board is the same as the first, and none are what the OP asked for..."something that's essentially just like a commodity x86 motherboard except it uses ARM." I suppose if mini-PCIe is your idea of slots and PCs don't include DVI or DisplayPort connections, or run commodity OSes and applications, then they slowly are starting to appear. Hey, embedded boards in PC form factors are PCs so long as you don't have to use them, right?

    7. Re:So, when can I buy an ARM ATX board? by RightSaidFred99 · · Score: 1

      Why, so you can buy a processor that is a fraction as fast as an i7? Whoohoo! Uhh, good times?

    8. Re:So, when can I buy an ARM ATX board? by petermgreen · · Score: 1

      Sure you can find a few dev boards or industrial embedded computing boards where the vendor happened to use the mini-itx form factor rather than something custom, which is handy if you want to slap one of them in an off-the-shelf case for whatever reason. Such boards have been around for some time.

      But they lack the features one takes for granted on Atom mini-itx boards like multiple SATA ports, a regular PCI or PCIe slot*, support for more RAM (most ARM boards have 1GB with the occasional board with 2GB just coming onto the market, while most Atom boards support 4GB with newer boards supporting 8GB). I also noticed that none of your links mentioned price. Shaving even a few tens of watts is not worth it if the board is double the price of an equivalent Atom solution.

      Interestingly the Marvell Armada XP platform can in principle offer most of what you find on Atom mini-itx boards but I haven't seen anyone build such a board around it. The only Armada XP hardware I've seen is the crazily overpriced and somewhat strangely set-up (lots of network ports, hardly any SATA) openblocks stuff, the even more overpriced baserock slab and the vapourware dell copper.

      *one of them has miniPCIe but that has a far smaller selection of cards and while in theory you can use adaptors in practice you'll never get it into a standard case if you do which kinda defeats the object of standard form factors

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
  11. No, they won't. by Dputiger · · Score: 5, Informative

    Current ARM processors may indeed have a role to play in supercomputing, but the advantages this article implies don't exist.

    Go look at performance figures for the Cortex-A15. It's *much* faster than the Cortex-A9. It also draws far more power. There's a reason why ARM's own product literature identifies the Cortex-A15 as a smartphone chip at the high end, but suggests strategies like big.LITTLE for lowering total power consumption. Next year, ARM's Cortex-A57 will start to appear. That'll be a 64-bit chip, it'll be faster than the Cortex-A15, it'll incorporate some further power efficiency improvements, and it'll use more power at peak load.

    That doesn't mean ARM chips are bad -- it means that when it comes to semiconductors and the laws of physics, there are no magic bullets and no such thing as a free lunch.

    http://www.extremetech.com/computing/155941-supercomputing-director-bets-2000-that-we-wont-have-exascale-computing-by-2020

    I'm the author of that story, but I'm discussing a presentation given by one of the US's top supercomputing people. Pay particular attention to this graph:

    http://www.extremetech.com/wp-content/uploads/2013/05/CostPerFlop.png

    What it shows is the cost, in energy, of moving data. Keeping data local is essential to keeping power consumption down in a supercomputing environment. That means that smaller, less-efficient cores are a bad fit for environments in which data has to be synchronized across tens of thousands of cores and hundreds of nodes. Now, can you build ARM cores that have higher single-threaded efficiency? Absolutely, yes. But they use more power.

    ARM is going to go into datacenters and supercomputers, but it has no magic powers that guarantee it better outcomes.

    1. Re:No, they won't. by Anonymous Coward · · Score: 0

      This * 1000

      The whole "1000 ARMs will replace 50 Xeons at the same power/performance" idea is such silly nonsense, I dunno how it keeps getting plugged on Slashdot.

    2. Re:No, they won't. by Lennie · · Score: 1

      Didn't Intel say that bringing down the cost and improving the performance of the interconnect was the goal of silicon photonics, and that they are now very close to mass production?

      However I don't know how power efficient it is.

      Could silicon photonics help close that gap?

      --
      New things are always on the horizon
  12. I want by EmperorOfCanada · · Score: 2

    I have long pined for a server with maybe 10 four-core ARM CPUs. Basically my server spends its time serving up web stuff from memory. Each web request needs to do a bit of thinking and then fire the data out the port. Disk IO is not an issue nor is server bandwidth. Quite simply I don't need much CPU but I need many CPUs. A big powerful Intel is of less interest.

    Also by breaking up the system into physically separate CPUs I suspect that an interesting memory accessing architecture could be conjured up preventing another potential choke point.

    1. Re:I want by 0123456 · · Score: 1

      Also by breaking up the system into physically separate CPUs I suspect that an interesting memory accessing architecture could be conjured up preventing another potential choke point.

      I suspect you mean it would have to be conjured up, or you'd spend all the time waiting to access RAM on other cores rather than doing anything useful.

    2. Re:I want by zbobet2012 · · Score: 1

      It's called NUMA, and we already have it in the Linux kernel. By the way it is very cheap these days to pick up a server with 64 or more cores that fits in a 1U / 2 processor server.
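
      If you want to see how the kernel has laid out the NUMA nodes on a given box, here is a minimal sketch that reads the per-node CPU lists the Linux kernel exposes under sysfs. The function name is made up for this example, and it simply returns an empty dict on systems that don't expose that directory:

```python
from pathlib import Path


def numa_cpu_lists(sysfs_root: str = "/sys/devices/system/node") -> dict:
    """Map each NUMA node name (e.g. 'node0') to its CPU list string
    (e.g. '0-15'), as exposed by the Linux kernel in sysfs."""
    nodes = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return nodes  # not Linux, or no NUMA information exposed
    for cpulist in sorted(root.glob("node[0-9]*/cpulist")):
        nodes[cpulist.parent.name] = cpulist.read_text().strip()
    return nodes


if __name__ == "__main__":
    for node, cpus in numa_cpu_lists().items():
        print(f"{node}: CPUs {cpus}")
```

      On a single-socket desktop this will typically show just one node; multi-socket servers show one node per memory domain.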

    3. Re:I want by EmperorOfCanada · · Score: 1

      I would love to know where to get a cheap 64 core 1U server. And I don't mean that in the usual snarky slashdot (I think you're wrong) way but I truly would love to know.

    4. Re:I want by Anonymous Coward · · Score: 0
    5. Re:I want by zbobet2012 · · Score: 2

      Supermicro 1u 64 cores. Bunch of other Mobos (some more than 1u) on this page. Cheap is relative to the buyer I suppose, but to my (admittedly very large) company these things are rather cheap unless you start stacking them with lots of dense memory.

    6. Re:I want by gl4ss · · Score: 1

      your description sounds like you would benefit more from them having separate memories as well. otherwise a "big powerful intel" would fit the bill, getting higher throughput of requests.

      --
      world was created 5 seconds before this post as it is.
    7. Re:I want by petermgreen · · Score: 1

      Some quick estimation (read: looking at online vendors but not shopping around carefully for best prices nor carefully checking compatibility) puts the cost of a basic system built round that board and with all slots filled with 16 core processors at the order of $5K.

      I guess that may be cheap to you, it certainly isn't to me.

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
    8. Re:I want by sl3xd · · Score: 1

      Such systems exist.

      Even Dell sells them.

      ARM is quite capable of competing with Intel, but it is no magic bullet.

      CPU Core power usage is only part of the overall system power usage.

      What you want is a number of web requests served per second. You can have a fast quad-core x86_64 chip do the job, or you can have many more (considerably slower) ARM cores do the job.

      In the end, the number of requests/second is similar, as is the power consumption.

      You can't get around the laws of physics - there is a minimal amount of power to perform an operation. Intel chips use more power because they perform more operations.

      --
      -- Sometimes you have to turn the lights off in order to see.
    9. Re:I want by EmperorOfCanada · · Score: 1

      But if I have 100 ARM chips they will be very sleepy when not in use. I find server usage spikes and jumps so they can come alive and absorb the blows. Also some processes go stupid so having the ability to have many jammed threads and still have the server run fine is great.

      I just see intel as being more efficient in a perfect world and ARM being more resilient to an imperfect world. Plus I would love to see some fresh blood in the server world.

  13. Xilinx Zync anybody? by Z00L00K · · Score: 4, Informative

    Has anybody else seen/considered the Xilinx Zynq? It's a mix of ARM cores and FPGA, which could be interesting in supercomputing solutions.

    For anyone willing to tweak around with it there are development boards around like the ZedBoard that is priced at US$395. Not the cheapest device around, but for anyone willing to learn more about this interesting chip it is at least not an impossible sum. Xilinx also have the Zynq®-7000 AP SoC ZC702 Evaluation Kit which is priced at US$895, which is quite a bit more expensive and not as interesting for hobbyists.

    Done right you may be able to do a lot of interesting stuff with a FPGA a lot faster than an ordinary processor can and then let the processor take care of stuff where performance isn't a critical part.

    Those chips are right now starting to find their way into vehicle ECUs, but it's still in an early phase so there aren't many mass produced cars yet with it.

    As I see it - supercomputers will have to look at every avenue to get maximum performance for the lowest possible power consumption - and avoid solutions with high power consumption in standby situations.

    --
    If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
    1. Re:Xilinx Zync anybody? by janisozaur · · Score: 2

      there was a successful kickstarter campaign some time ago that introduced precisely those chips [Zynq-7020] at quite affordable prices: http://www.kickstarter.com/projects/adapteva/parallella-a-supercomputer-for-everyone/ they are going to have them available for sale soon at http://www.parallella.org/ after the kickstarter pledges are fulfilled. at $100 + shipping they are far more affordable than what you mention, they are also committed to having most (if not all?) things open, so be sure to check out their website.

    2. Re:Xilinx Zync anybody? by Anonymous Coward · · Score: 0

      They use FPGAs in high frequency trading.

      They take loads of power though so they will never be used in supercomputing.

      If they wanted to use something like that, they would make it into an ASIC.

  14. I understand all right - try reading full posts! by dbIII · · Score: 1

    "keeping the things fed with memory can be slower than doing it on a CPU in the first place" is the line you've missed and is why GPUs don't solve every highly parallel problem at the moment. They can do reverse time migration, but can't currently do time migration, depth migration, tomography etc etc. The penalties of swapping so much memory in and out are far too costly, to the point of orders of magnitude of performance or complete showstoppers where you just can't get enough in for it to work at all.

  15. One Size Doesn't Fit All -- Same in Supercomputing by gentryx · · Score: 4, Informative

    There is already one line of supercomputers built from embedded hardware: the IBM Blue Gene. Their CPUs are embedded PowerPC cores. That's the reason why those systems typically have an order of magnitude more cores than their x86-based competition.

    Now, the problem with BG is that not all codes scale well with the number of cores. Especially when you're doing strong scaling (i.e. you fix the problem size, but throw more and more cores at the problem), Amdahl's law tells you that it's beneficial to have fewer/faster cores.
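
    To put rough numbers on that, here is a minimal sketch of Amdahl's law for strong scaling. The 95% parallel fraction and the two machine configurations are made-up illustrative assumptions, not measurements from Blue Gene or any real system:

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: speedup over one core when only
    `parallel_fraction` of the work can be spread across `cores`."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def effective_throughput(parallel_fraction: float, cores: int,
                         per_core_speed: float) -> float:
    """Throughput relative to a single core of speed 1.0."""
    return per_core_speed * amdahl_speedup(parallel_fraction, cores)


if __name__ == "__main__":
    p = 0.95  # assumed parallel fraction (hypothetical)
    # Same aggregate peak in both cases: 16 * 1.0 == 64 * 0.25
    few_fast = effective_throughput(p, cores=16, per_core_speed=1.0)
    many_slow = effective_throughput(p, cores=64, per_core_speed=0.25)
    print(f"16 fast cores: {few_fast:.2f}x")
    print(f"64 slow cores: {many_slow:.2f}x")
```

    With the same aggregate peak, the serial 5% runs four times slower on the slow cores, which is why the many-slow-cores configuration comes out behind under these assumptions.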

    Finally I consider the study to be fundamentally flawed as it compares the OEM prices of consumer-grade embedded chips with retail prices of high-end server chips. This is wrong for so many reasons... you might then throw in the 947 GFLOPS, $500 AMD Radeon 7970, which beats even the ARM SoCs by a margin of 2x (ARM: ~1 GFLOPS/$, AMD Radeon: ~2 GFLOPS/$).

    --
    Computer simulation made easy -- LibGeoDecomp
  16. Power Efficiency - MIPS vs ARM by Taco+Cowboy · · Score: 2

    I may be wrong here, but I get the impression that the MIPS architecture is much more power efficient than that of the ARM architecture

    If they are going to talk about building up a big iron using CPUs which are of high power efficiency, I reckon the MIPS cpu might be more suitable for this task than one from the ARM camp

    --
    Muchas Gracias, Señor Edward Snowden !
    1. Re:Power Efficiency - MIPS vs ARM by julesh · · Score: 4, Insightful

      I may be wrong here, but I get the impression that the MIPS architecture is much more power efficient than that of the ARM architecture

      If they are going to talk about building up a big iron using CPUs which are of high power efficiency, I reckon the MIPS cpu might be more suitable for this task than one from the ARM camp

      I don't think it is. Best figures (albeit somewhat out-of-date) I can find for a MIPS-based system is 2GFLOPS/W for a complete 6-core node including memory. ARM Cortex A15 power consumption is a little hard to track down, although it's suggested that a 4-core 1.8GHz configuration (eg Samsung Exynos 5) could run at full speed on 8W (if the power manager let it; the Exynos 5 throttles down when it consumes more than 4W). Performance per GHz/core is about 4GFLOPS, so this system should be able to pull in about 28.8GFLOPS (or twice that if using ARM's "NEON" SIMD system to full advantage). Add in ~2W for 1GB DDR3 SDRAM, and that's 2.9GFLOPS/W. Assuming that the MIPS system I found is not the best available (as the data was from 2009 it certainly seems likely better is available now), the two appear to be roughly comparable.
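
      The arithmetic above can be reproduced with a small sketch. All inputs are the estimates quoted in this comment (4 cores, 1.8GHz, about 4 GFLOPS per GHz per core, 8W for the SoC, ~2W for memory); the function names are just for this example:

```python
def peak_gflops(cores: int, clock_ghz: float,
                gflops_per_ghz_per_core: float) -> float:
    """Peak throughput estimate: cores x clock x per-GHz-per-core rate."""
    return cores * clock_ghz * gflops_per_ghz_per_core


def gflops_per_watt(gflops: float, cpu_watts: float,
                    memory_watts: float) -> float:
    """Efficiency of the node, counting CPU and memory power together."""
    return gflops / (cpu_watts + memory_watts)


if __name__ == "__main__":
    # Estimates quoted in the comment for a 4-core Cortex-A15 at 1.8GHz
    gflops = peak_gflops(cores=4, clock_ghz=1.8, gflops_per_ghz_per_core=4.0)
    efficiency = gflops_per_watt(gflops, cpu_watts=8.0, memory_watts=2.0)
    print(f"peak: {gflops:.1f} GFLOPS")
    print(f"efficiency: {efficiency:.2f} GFLOPS/W")
```

      That gives 28.8 GFLOPS over 10W, i.e. roughly 2.9 GFLOPS/W, against the 2 GFLOPS/W figure for the MIPS node.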

    2. Re:Power Efficiency - MIPS vs ARM by Anonymous Coward · · Score: 0

      That and MIPS has a history in HPC, back when SGI was still in the game. There's little risk-taking involved in going MIPS. We already know the arch can scale to stupid levels.

    3. Re:Power Efficiency - MIPS vs ARM by niftymitch · · Score: 2

      I may be wrong here, but I get the impression that the MIPS architecture is much more power efficient than that of the ARM architecture

      If they are going to talk about building up a big iron using CPUs which are of high power efficiency, I reckon the MIPS cpu might be more suitable for this task than one from the ARM camp

      MIPS is an under invested older but great technology.
      Another historic winner was the DEC Alpha.

      As the folk at Transmeta (and others) demonstrated logic to decode any random ISA and drive a RISC core faster than the old VAX microcode days is very possible. This seems to be the way of modern processors. So ARM/x86/x86_64 ISA almost does not matter except to the compiler and API/ABI folk. If you want to go fast feed your compiler folk well.

      --
      Truth is stranger than fiction, but it is because Fiction is obliged to stick to possibilities; Truth isn't. Mark Twain.
    4. Re:Power Efficiency - MIPS vs ARM by Bert64 · · Score: 2

      Another advantage of MIPS is that 64bit MIPS is already mature, having been around since the early 90s... 64bit ARM on the other hand is new and not widely supported yet.

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    5. Re:Power Efficiency - MIPS vs ARM by KonoWatakushi · · Score: 2

      As the folk at Transmeta (and others) demonstrated logic to decode any random ISA and drive a RISC core faster than the old VAX microcode days is very possible. This seems to be the way of modern processors. So ARM/x86/x86_64 ISA almost does not matter except to the compiler and API/ABI folk. If you want to go fast feed your compiler folk well.

      One of the best ways you can help the compiler folk is with an orthogonal and sensible architecture. Furthermore, consider that generating good code is a problem that must be solved for every language, so starting with a good ISA makes for a lot less work.

    6. Re:Power Efficiency - MIPS vs ARM by K.+S.+Kyosuke · · Score: 1

      If they are going to talk about building up a big iron using CPUs which are of high power efficiency, I reckon the MIPS cpu might be more suitable for this task than one from the ARM camp

      You must be Chinese. Or, to be more specific, Chinese are already doing exactly the thing you've just suggested. :-) More power to them, I say. (Not POWER, mind you!)

      --
      Ezekiel 23:20
    7. Re:Power Efficiency - MIPS vs ARM by Anonymous Coward · · Score: 0

      The Ingenic SoCs are pretty competitive, with a claim of 140mW per GHz per core.

      http://www.intomobile.com/2013/01/08/ingenic-and-mips-demonstrate-new-systemonchip-10inch-android-tablet-ces/

      Loongson 3B is a pretty interesting design, with 128 GFLOPS (256 GFLOPS single precision) in 40W on a 65nm process. If shrunk to 32nm, you could probably get about 1 2/3 times to double the performance stats.

    8. Re:Power Efficiency - MIPS vs ARM by Anonymous Coward · · Score: 0

      Hell no, Cray-1 all the way!

  17. That's what is so funny to me by Sycraft-fu · · Score: 4, Insightful

    Slashdot seems to have lots of ARM fanboys that look at ARM's low power processors and assume that ARM could make processors on par with Intel chips but much more efficient. They seem to think Intel does things poorly, as though they don't spend billions on R&D.

    Of course that would raise the question as to why ARM doesn't, and the answer is they can't. The more features you bolt on to a chip, the higher the clock speed, and so on, the more power it needs. So you want 64-bit? More power. Bigger memory controller? More power. Heavy hitting vector unit? More power. And so on.

    There's no magic ju ju in ARM designs. They are low power designs, in both senses of the word. Now that's wonderful, we need that for cellphones. You can't be slogging around with a 100 watt chip in a phone or the like. However don't mistake that for meaning that they can keep that low consumption and offer performance equal to the 100 watt chip.

    1. Re:That's what is so funny to me by AmiMoJo · · Score: 1

      The point is that an ARM processor can provide, say, 75% of the performance for 25% of the power compared to x86. You can see it in tablet computers, particularly those running Windows RT or Ubuntu where a direct comparison is possible. Since most of the bottlenecks are not due to processing power but rather disk, RAM, graphics rendering, network etc. you very quickly reach the point of diminishing returns with increasing CPU performance.

      In the case of supercomputers the same thing applies. You might want a fast FPU/GPU but the CPU core isn't too critical. Your application might need vast quantities of data that take time to move around so it doesn't matter if processing it takes a bit longer. You might benefit from simply having more cores to make your task more parallel since each core only requires 50% as much power and cooling.

      You seem to think that for a chip to be high performance everything has to be upgraded at once. If your tasks are vector FPU intensive you can have a fast vector FPU, it doesn't mean you also must have a big memory controller or higher ALU core speed.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    2. Re:That's what is so funny to me by gl4ss · · Score: 1

      yeah well, we'll see when it does 75% performance for 25% of power. it doesn't. you can't see it in tablets right now. that's what next gen is supposed to fix. but the next gen arm design is going to use more power to get there.

      (incidentally memory access, network etc are all slower on arm and for most supercomputing they do matter)

      it is a bit boring to read these articles now for a decade though. "intel is dead due to arm in two years!! yeehaw!!". they were even more boring back in the day when intel was manufacturing an arm design though..

      --
      world was created 5 seconds before this post as it is.
    3. Re:That's what is so funny to me by Anonymous Coward · · Score: 0

      Slashdot seems to have lots of ARM fanboys that look at ARM's low power processors and assume that ARM could make processors on par with Intel chips but much more efficient.

      This is false. It's widely acknowledged that ARM processors don't come at the top on MFLOP counts or on those artificial performance benchmarks.

      What is true is that they do have enough performance for doing everyday work. Tablet computers demonstrated that quite nicely.

      Not everyone needs to play crysis on full resolution. Or even graphics-intensive computer games.

    4. Re:That's what is so funny to me by Anonymous Coward · · Score: 0

      The way I see it, ARM fanboys want to see a more diverse market, more innovation and more competition. It's not an argument about ARM vs. x86 ISA, it's a desire to see the Wintel world crumble.

    5. Re:That's what is so funny to me by Anonymous Coward · · Score: 0

      There's a plausible argument to be made that Intel is hampered, to some extent, by sticking with the x86 architecture, with assorted legacy cruft. ARM might be able to get better performance per Watt, without any magic engineering advantage, simply by dropping some of that baggage.

      I don't know whether this is the case or not, but it's within the realm of possibility, based on my limited understanding.

    6. Re:That's what is so funny to me by TopSpin · · Score: 3, Interesting

      There's no magic ju ju in ARM designs.

      The magic ju ju is the ARM business model. There is one trump card ARM holds that precludes Intel from many portable devices: chip makers can build custom SOCs in-house with whatever special circuits they want on the same die. Intel doesn't do that and they don't want to do it; it would mean licencing masks to other manufacturers like ARM does. For example, the Apple A5, manufactured by Samsung, includes third party circuits like the Audience EarSmart noise-cancellation processor, among others. It is presently not feasible to imagine Intel handing over masks such that Apple could then contract with some foundry to manufacture custom x86 SOCs. This shuts Intel out of many portable use cases.

      That feature of the ARM business model might be very useful to large scale computing. One can imagine integrating a custom high-performance crossbar with an ARM core. Cores on separate dies could then communicate with the lowest possible latency. Using a general purpose ARM core to marshal data to and from high-performance SIMD circuits on the same die is another obvious possibility. A custom cryptography circuit might be hosted the same way.

      Contemporary supercomputers are great aggregations of near-commodity components. However, supercomputing has a long history of custom circuit design and if the need arises for a highly specialized circuit then a designer may decide that integrating with ARM to do the less exotic leg work computing that is always necessary is a good choice.

      --
      Lurking at the bottom of the gravity well, getting old
    7. Re:That's what is so funny to me by dfghjk · · Score: 1

      "ARM fanboys want to see a more diverse market..."

      Fanboys aren't into diversity, they're into cheering for "their team". It's tribalism, not merit.

    8. Re:That's what is so funny to me by Anonymous Coward · · Score: 0

      I think you're the only one that gets it. ARM will displace x86 in the server space, not because their cores are superior to Intel but because they have a superior business model that's amenable to building heterogeneous systems which will lower system power and increase system performance.

    9. Re:That's what is so funny to me by Dputiger · · Score: 1

      "The point is that an ARM processor can provide, say, 75% of the performance for 25% of the power compared to x86. "

      http://www.anandtech.com/show/6529/busting-the-x86-power-myth-indepth-clover-trail-power-analysis

      From the article: "Ultimately I don't know that this data really changes what we already knew about Clover Trail: it is a more power efficient platform than NVIDIA's Tegra 3."

      This single data point doesn't mean 32nm Atom is more power efficient than any other ARM device, but it illustrates that the gap you imply is device-specific -- not inherent to the microarchitectures. There are going to be ARM devices that are more power efficient than Atom and Atom devices that are more power efficient than ARM.

    10. Re:That's what is so funny to me by Dputiger · · Score: 1

      "I think you're the only one that gets it. ARM will displace x86 in the server space, not because their cores are superior to Intel but because they have a superior business model that's amenable to building heterogeneous systems which will lower system power and increase system performance."

      I don't think you understand. In an era when moving data is extremely inefficient due to power constraints, you need superior cores. Want heterogeneous many-core? Buy Xeon Phi. Want strong single-thread? Buy high-end Xeon. Want maximum power efficiency with leading single-thread performance? You buy Bay Trail.

      Need a lower power interconnect? Buy parts with photonics (in 5-6 years, anyway).

      Can ARM create products that fit well in this market? Sure. But they aren't going to waltz in and displace Intel. They may pick up 10-15% of the market; they're not going to sweep.

    11. Re:That's what is so funny to me by Solandri · · Score: 1

      Of course that would beg the question as to why ARM doesn't and the answer is they can't. The more features you bolt on to a chip, the higher the clock speed, and so on, the more power it needs. So you want 64-bit? More power. Bigger memory controller? More power. Heavy hitting vector unit? More power. And so on.

      Isn't this just RISC (ARM) vs CISC (x86/x64) all over again? While I won't rule out the possibility that RISC may win this time around, it's rather telling that CISC has won all previous rounds.

    12. Re:That's what is so funny to me by Anonymous Coward · · Score: 0

      The other side of the coin, of course, is that Intel chips are addressing the power issue directly. They might not be as energy efficient, but Intel isn't exactly sitting around doing nothing about it. ARM is great, but my guess is the assumptions in these conversations will shift 2 years from now when we have Haswell and post-Haswell chips as part of the discussion. Not that Haswell is some magic bullet, but I think it will shift public perceptions of things a bit.

    13. Re:That's what is so funny to me by Dputiger · · Score: 1

      It's less true than people think. The power consumption penalty of x86 is low single-digits. While it exists, Intel compensates for it with better sleep states and superior silicon.

    14. Re:That's what is so funny to me by marcosdumay · · Score: 1

      In an era when moving data is extremely inefficient due to power constraints, you need superior cores.

      That's not what the data you posted say. It says that you need better processors, not cores.

      There is no reason why you can't create a better processor with worse cores. Well, or maybe there is, but it's not obvious in any way, and I doubt it's known at all... And it's quite likely that you can improve our current processors with heterogeneous designs.

    15. Re:That's what is so funny to me by Anonymous Coward · · Score: 0

      In an era when moving data is extremely inefficient due to power constraints, you need superior cores. Want heterogeneous many-core? Buy Xeon Phi. Want strong-single thread? Buy high-end Xeon.

      Of course those have to be integrated on the same chip eventually to create that heterogeneous multi-core. The ARM camp has a possibility of getting there first (in the HPC market) if they really push it. In the sense of avoiding data movement, you seem to agree with the parent. The need for superior cores really stems from the problem of strong scaling, though.

    16. Re:That's what is so funny to me by Anonymous Coward · · Score: 0

      ARM sucks. Why the fuck do you think I want a 386 wannabe processor anywhere near me. That's right ARM 1-7 is no faster than a 386 and uses the same amount of power. That's right, max power usage was 400mA on the 386DX-40. The 386SLC used even less. Their latest ARM offerings are as fast as a Pentium II. Oooooh! Ahhhhhh! ARM sucked 30 years ago. ARM sucked 20 years ago. ARM still sucks. I guess when all the know-everything young adults of today quit buying the ARM marketing lie they will understand this too.

    17. Re:That's what is so funny to me by sl3xd · · Score: 1

      75% of the performance for 25% of the power compared to x86

      The problem is they don't provide anything approaching that sort of efficiency.

      I've had the, um, privilege of benchmarking a few of the new up-and-coming ARM server systems and chips. It's pretty neat to be able to have four quad-core servers, each with 4GB of memory, pulling a total of 40W or so. That's a great system for a web server farm.

      The problem is when you compare the throughput vs performance for high performance computing. For a few workloads, the new ARM systems compare favorably - giving a small edge in work done per watt. The performance advantage per watt in these workloads is usually less than 5%.

      The best-case for the ARM systems, under workloads that are most ideal for the ARM systems in question: 105% of the performance for the same power draw.
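      To put the two figures side by side, here is a small arithmetic sketch. It only uses numbers already in this thread (the quoted "75% of the performance for 25% of the power" claim and the 105% best case above); the function name is made up for illustration.

```python
def perf_per_watt_ratio(relative_performance: float, relative_power: float) -> float:
    """Performance-per-watt relative to a baseline system.

    Both arguments are fractions of the baseline (1.0 == same as baseline).
    """
    if relative_power <= 0:
        raise ValueError("relative_power must be positive")
    return relative_performance / relative_power


# The claim quoted at the top of this post: 75% performance at 25% power.
claimed = perf_per_watt_ratio(0.75, 0.25)

# The best case reported above: 105% performance at the same power draw.
measured_best_case = perf_per_watt_ratio(1.05, 1.00)

if __name__ == "__main__":
    print(f"claimed advantage:        {claimed:.2f}x")
    print(f"best case reported above: {measured_best_case:.2f}x")
```

      In other words, the claim implies a 3x efficiency edge, while the best case described here is about 1.05x.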

      If your workload doesn't scale to a large number of cores easily, or has a large amount of inter-process communication, the current ARM systems are hopeless. Even with a supposedly high-performance backplane in a chassis hosting around 40 ARM nodes, it was soundly trounced by one x86_64 node.

      Note this is for a cluster system, so many x86_64 and ARM nodes are being used for the benchmark.

      One x86_64 node can handle the workload of 40+ ARM nodes. Granted, you can fit 40+ ARM nodes in a 3U chassis, but it's still a lower overall compute density than with 1U x86_64 nodes. Simply throwing more cores at the problem doesn't necessarily give you the gains you'd think.

      While ARM is theoretically capable of better performance/watt, it's impossible to get anything resembling theoretical in a supercomputing application. Any advantage ARM has in performance/watt is eaten up by the overhead of having to use so many (slower) cores. Very few workloads scale linearly as you throw in more cores, as various overheads (MPI, network, etc.) decrease the overall efficiency dramatically.
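      A minimal sketch of why piling on slower cores stops paying off, assuming Amdahl's law as the model (the post doesn't name one, and real MPI/network overheads are usually worse than this). The per-core speed of 0.25 and the 95% parallel fraction are hypothetical numbers picked for illustration, not measurements.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over a single core of the same type, per Amdahl's law."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if cores < 1:
        raise ValueError("cores must be >= 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def relative_throughput(core_speed: float, parallel_fraction: float, cores: int) -> float:
    """Throughput of `cores` slow cores, measured in units of one fast core.

    core_speed is the speed of one slow core relative to the fast core
    (a hypothetical 0.25 means "a quarter as fast").
    """
    return core_speed * amdahl_speedup(parallel_fraction, cores)


if __name__ == "__main__":
    # Hypothetical workload: 95% parallel, slow cores at 1/4 the speed.
    for n in (1, 4, 16, 40, 160):
        print(n, round(relative_throughput(0.25, 0.95, n), 2))
```

      With these made-up inputs the throughput can never exceed 0.25 / 0.05 = 5 fast-core equivalents, no matter how many slow cores you add; the serial 5% sets the ceiling.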

      Currently, you can't use ARM for memory-hungry applications, as you'll hit the 4GB limit. 64-bit ARM is promising, but it's also not for sale.

      The best performance per watt for supercomputing workloads is still found in accelerators, such as GPUs or Intel's Xeon Phi.

      ARM is very promising for many datacenter type workloads, where there are a large number of unrelated, independent processes, such as a farm of web servers. (Any database backend is, however, a different matter).

      While slightly OT (as it's a non-supercomputing application): What if you want to use an application that uses Java server-side? Forget it. The current ARM JVMs (OpenJDK as well as Oracle's) both appear to lack JIT; the only way to get Java to have a similar performance/watt between ARM and x86_64 is to disable JIT on x86_64. This is largely a software issue, but until it's fixed, forget about Java on an ARM server.

      --
      -- Sometimes you have to turn the lights off in order to see.
  18. Re:I understand all right - try reading full posts by symbolset · · Score: 1

    Frankly I agree with you. I'm thinking the average /. reader will find your post incoherent though.

    --
    Help stamp out iliturcy.
  19. Re:I understand all right - try reading full posts by dbIII · · Score: 1

    I think my initial general comment about memory is properly aimed at a high school level readership Mr "sixth level cache" :)

  20. Not only Performance per $ by gentryx · · Score: 2

    ...but also reliability (because supercomputers are really large and one failed node will generally crash the whole job, thereby wasting gazillions of core hours; that's one reason why SC centers buy expensive Nvidia Tesla hardware instead of the cheaper GeForce series) and IO and memory bandwidth and finally integration density. That one Intel chip can be more tightly integrated as it won't generate as much excess heat per GFLOPS (according to TFA...).

    --
    Computer simulation made easy -- LibGeoDecomp
  21. cant wait! by Anonymous Coward · · Score: 0

    Well, you don't expect them to just go and outperform the others; they are obviously gonna take time optimizing. But what interests me more is more competition and something new to look forward to.
    Cheers,

  22. Not this week... by niftymitch · · Score: 2
    Not this week....
    I am a fan boy for the small ARM boards... I have built an MPI cluster out of Raspberry-Pi boards and it is not even close except as a teaching exercise where it excels.

    However many site services can be dedicated to these little boards where corp IT seems to dedicate virtual machines.

    Department Web Servers... with mostly static content... via NFS or a revision control system like hg.
    Department and internal caching name servers... NTP servers and managed central storage for each building or closet.

    The impact of the little ARM boards has kicked Intel in their lethargy-loaded-behind. Their next generation sub-25 Watt systems will take names and kick butt as long as IT does not overload them with WindowZ.

    IT departments will find the management advantage of Chromebox devices connected to quality screens compelling.

    Users will find that flipping open the company ChromeOS laptop will put them on the same page as the big screen in the office...

    It is true that this is not 100% ready for prime time for all of us but the handwriting is on the wall.

    --
    Truth is stranger than fiction, but it is because Fiction is obliged to stick to possibilities; Truth isn't. Mark Twain.
  23. It's Mont-Blanc, not Mountblanc by Anonymous Coward · · Score: 0

    It's Mont-Blanc, not Mountblanc.

  24. And every watt needs to be cooled. by Anonymous Coward · · Score: 1

    Not something you care about with a mobile phone, but with a HPC system you really DO care about every watt dissipated.

  25. GNU/Linux on ARM by tepples · · Score: 2

    A gimp version of windows is not going to get the job done.

    On the other hand, a Windows version of GIMP does get a lot of jobs done that don't quite need Adobe Photoshop.

    But seriously, the reason Windows RT is "gimped" is because Microsoft has refused to endorse recompiling desktop applications. That's not a failing of ARM, as ARM ran RISC OS on Acorn computers, as much as a power grab by Microsoft.

    Some of the Samsung Slate tablets however come with an x86...and are actually fully functional! Can you point to an ARM tablet that can do everything it can?

    Some ARM tablets run Ubuntu. Other Android tablets run Debian in a chroot, with video out through an X11 server app for Android. These can't run Windows applications in Wine the way x86 systems can, but they work for any GNU/Linux application that has been recompiled for ARM.

    1. Re:GNU/Linux on ARM by Redmancometh · · Score: 1

      I enjoyed your post thoroughly, especially the GIMP joke! Also it's rather informative. I'm glad Ubuntu can be put on an ARM tablet, as that actually makes me consider getting one. However, most employers aren't going to want to spend the labor-hours training their workers on a new platform.

      For IT workers that is pretty awesome though. I think I just got pissy over that guy saying x86 is a useless pile of crap. I don't really remember.

  26. Well, there's a load of bollocks. by Anonymous Coward · · Score: 0

    Scientific computing has ALWAYS been bespoke for the big iron. Because each scientific model is unique to the problem domain and the ideas of how it is going to be solved.

    The compiler is relied upon to produce the most optimal code resulting from the (usually FORTRAN) source and the computer libraries called are optimised to the machine that they run on.

    I work for the UK Met Office, and I've not heard of any other big computing resources that do anything different.

  27. An ISA is only as good as... by tepples · · Score: 1

    An ISA is only as good as its most efficient implementation.

  28. ARM can win by Reliable+Windmill · · Score: 1

    I think ARM can win this, because it has a superior, more streamlined ISA. x86 is a relic, a dinosaur, and it's all over the place. Just like x86 can do low-power designs, ARM can also reach the same performance, but also with a smaller and more power efficient implementation thanks to its refined ISA.

    --
    Signature intentionally left blank.
    1. Re:ARM can win by RightSaidFred99 · · Score: 1

      You are delusional then. x86 translation hardware is a tiny part of modern Intel or AMD processors. You don't seem to understand that Intel can come down in power much more easily than ARM can go up in performance. This is borne out by the fact that Intel now has competitors to ARM that are as good or better in some cases in terms of power usage for the cell phone market.

      Watch what happens in 1 year to the tablet market. Yes just one year. ARM will only be in sub $200 tablets - everything else will belong to Atom.

  29. Sure, but by pem · · Score: 1

    That advantage goes away if your core is superscalar -- you still have issues with branching and not keeping the queue full. Some versions of x86 superscalar can execute both sides of branches, then discard the results of the branch not taken. There is no reason that an architecture with an ARM instruction set could not do this; but then some of the performance-per-watt benefits would be leveled out.

  30. Re:One Size Doesn't Fit All -- Same in Supercomput by LordVader717 · · Score: 1

    Typical supercomputing tasks usually scale quite well (otherwise there's little point in running them on a supercomputer in the first place), which is of course why GPUs are so interesting.

  31. Someone explain to me.... by Anonymous Coward · · Score: 0

    why then, are both of the new video game consoles moving to x86-based architectures?

  32. Re:One Size Doesn't Fit All -- Same in Supercomput by gentryx · · Score: 1

    True, but with limits. There is a reason why LRZ bought SuperMUC without GPUs: a) fewer, faster cores, b) users didn't have to change their codes. Now, machines like BG/Q scale extremely well, despite having such a high core count. But they have the interconnect built right into the chip architecture. We don't have anything comparable on current ARM designs, but hey, the future is gonna be interesting.

    --
    Computer simulation made easy -- LibGeoDecomp
  33. Then let's see it. by Sycraft-fu · · Score: 1

    I keep hearing this kind of thing from ARM fans. Ok, show it to me. You can't, because it doesn't exist, nor anything even close to it. What that means is you are just hoping this is the case, making things up, not that it is actually the case.

  34. The only real limitation by FithisUX · · Score: 1

    is the lack of an IOMMU by default on all ARMs.

  35. there is one super power.... by cheekyboy · · Score: 1

    And that is, ARM has many makers/sellers , but intel..... is just one intel..... a single source for all your $$$$. More than one ARM source is better for competition.

    --
    Liberty freedom are no1, not dicks in suits.
  36. CPUs need built in RTG by cheekyboy · · Score: 1

    How much thorium is needed to power a CPU for 5 years?

    --
    Liberty freedom are no1, not dicks in suits.
    1. Re:CPUs need built in RTG by White+Flame · · Score: 1

      Nuclear power generation involves converting heat to electricity. That normally means boiling water to drive steam turbines or, in the case of satellites, using thermocouples. In either of these cases, there is a lot of waste heat generated. You don't want that right on your silicon in a densely packed environment.