Slashdot Mirror


Intel Launches 72-Core Knight's Landing Xeon Phi Supercomputer Chip (hothardware.com)

MojoKid writes: Intel announced a new version of their Xeon Phi line-up today, otherwise known as Knight's Landing. Whatever you want to call it, the pre-production chip is a 72-core coprocessor solution manufactured on a 14nm process with 3D Tri-Gate transistors. The family of coprocessors is built around Intel's MIC (Many Integrated Core) architecture which itself is part of a larger PCI-E add-in card solution for supercomputing applications. Knight's Landing succeeds the current version of Xeon Phi, codenamed Knight's Corner, which has up to 61 cores. The new Knight's Landing chip ups the ante with double-precision performance exceeding 3 teraflops and over 8 teraflops of single-precision performance. It also has 16GB of on-package MCDRAM memory, which Intel says is five times as power efficient as GDDR5 and three times as dense.

117 of 179 comments

  1. 22nm is old school for Intel by nicoleb_x · · Score: 1

    "The cores are 14nm versions of Silvermont, rather than 22nm P54C"

    1. Re:22nm is old school for Intel by Hylandr · · Score: 1

      > if you aren't a liberal when you're very young you don't have a heart, if you're still a liberal after you grow up you don't have a fucking brain.

      *High Five*

      I feel exonerated of my youthful days !!

      --
      ~ People that think they are better than anyone else for any reason are the cause of all the strife in the world.
  2. LOL ... Crikey ... by gstoddart · · Score: 4, Insightful

    So, somewhere someone at AMD is going "fuck it, we're going to 128 cores".

    Damn ... that's a crap pile of cores ... that's like, Skynet in a box or something.

    The mind reels.

    --
    Lost at C:>. Found at C.
    1. Re:LOL ... Crikey ... by PhunkySchtuff · · Score: 2

      Yep, but it'll be 128 integer cores with 64 floating-point cores, and someone will take them to court over it... because... butthurt.

    2. Re:LOL ... Crikey ... by AK+Marc · · Score: 1

      And someone, somewhere is muttering "256 cores should be enough for anyone".

    3. Re:LOL ... Crikey ... by Type44Q · · Score: 1

      Damn ... that's a crap pile of cores

      IMHO the only metrics that aren't subjective are transistor count and process tech...

    4. Re:LOL ... Crikey ... by jones_supa · · Score: 1

      That should be enough for many entry level systems, and even some light gaming, well with the right graphics card...

      NVIDIA GTX 750 Ti is the perfect match.

    5. Re: LOL ... Crikey ... by IBME · · Score: 1

      If I can cut transcode down to a few minutes for say a dvd that normally takes hours, I would be delighted. Obviously a million dollar data center would not use these. Playboy entrepreneur billionaires? Yes. Me no

    6. Re:LOL ... Crikey ... by Coren22 · · Score: 1

      Um, I am not sure you understand the scale of this product. They are fitting 72 cores and 16GB of RAM into approximately 2 inches by half an inch. Think this:

      http://www.newegg.com/Product/...

      Not the size of a blade. This is a tiny thing that could go in a laptop as a co processor for who knows what.

      --
      APK likes to ask for responses to the same things over and over. Maybe he just likes the responses?
    7. Re:LOL ... Crikey ... by BigFootApe · · Score: 1

      How about the Mach 20?
      https://www.youtube.com/watch?v=2FAP8o5ZEo0

    8. Re:LOL ... Crikey ... by Methadras · · Score: 1

      Yes, but will it play Crysis though?

  3. Cool! by QuietLagoon · · Score: 1

    I want one of these in my next notebook....

    1. Re:Cool! by Anonymous Coward · · Score: 1

      You'd be lucky to get 5 minutes of battery life out of it.

    2. Re:Cool! by QuietLagoon · · Score: 1

      Yeah, but what a fun five minutes it could be. :)

    3. Re:Cool! by TheDarkMaster · · Score: 1

      Four minutes. And glowing red :-D

      --
      Religion: The greatest weapon of mass destruction of all time
    4. Re:Cool! by gstoddart · · Score: 3, Funny

      Bah, turn on Flash and IE and it'll be 2 minutes and glowing white. ;-)

      --
      Lost at C:>. Found at C.
    5. Re:Cool! by ls671 · · Score: 1

      Don't worry, I read it has a power saving mode and switches to 2 core while on batteries making it more efficient than a celeron.

      --
      Everything I write is lies, read between the lines.
  4. Imagine a Beowulf Cluster of these... by frnic · · Score: 1

    Just saying

    1. Re:Imagine a Beowulf Cluster of these... by ClickOnThis · · Score: 5, Funny

      That chip is a Beowulf cluster.

      --
      If it weren't for deadlines, nothing would be late.
    2. Re:Imagine a Beowulf Cluster of these... by ls671 · · Score: 1

      Then, that would make a Beowulf cluster of Beowulf clusters.

      --
      Everything I write is lies, read between the lines.
    3. Re:Imagine a Beowulf Cluster of these... by SuricouRaven · · Score: 2

      A meta-beowulf.

  5. The video is complete drivel by Required+Snark · · Score: 2
    If you watch the video in the linked article it is 100% buzzword marketspeak with zero information content. Disruptive technology blah blah integration blah blah innovation blah blah software continuity blah blah...

    It is probably a good chip for its niche, so you would think they would have less bloviation in their intro video. If this was anyone else I would assume they were mostly trying to fleece more investors before they inevitably went belly up. It's so bad that major league sports style animation with yelling pitchman and a pounding beat would be an improvement. That bad.

    --
    Why is Snark Required?
    1. Re:The video is complete drivel by Anonymous Coward · · Score: 2, Informative

      Better article.

      Also summary is wrong: it's on the 14nm process (the previous gen one was 22).

      Really the memory looks like the only interesting thing here.

    2. Re:The video is complete drivel by MojoKid · · Score: 1

      That has been corrected. It is 14nm.

  6. Application? by unixisc · · Score: 1

    So what exactly is the real world application of such a beast? Are there that many x64 based supercomputers out there?

    1. Re:Application? by Galactic+Dominator · · Score: 1

      They pretty much all are, Los Alamos, Blue Waters, Oak Ridge National Laboratory, all the medium to long term weather forecasting I know of, etc.

      Generally 10,000 to 100,000+ nodes.

      --
      brandelf -t FreeBSD /brain
    2. Re:Application? by gl4ss · · Score: 2

      raytrace version of wolfenstein, pretty much.

      cost effectivity for other uses.. well..

      --
      world was created 5 seconds before this post as it is.
    3. Re:Application? by Jeremi · · Score: 4, Funny

      Bitcoin mining, of course -- it may not be as fast as a similarly-priced GPU farm, but the coins it creates will be of the highest possible quality and workmanship.

      --


      I don't care if it's 90,000 hectares. That lake was not my doing.
    4. Re:Application? by byornski · · Score: 1

      With the exception of the IBM machines, which held most of the top spots in 2012. The LLNL one is still #3... http://www.top500.org/list/201... and the ANL #5. The general idea was cheaper processors that don't do pipelining, but OTOH the cost of them is massively decreased and they all had an instruction unit capable of two threads per core, so the onus is on the programmer to make sure that the ALU was kept constantly fed.

    5. Re:Application? by dbIII · · Score: 1

      Heaps of x86 based supercomputers, but these are a bit bandwidth limited by the bus they are plugged into and how chatty these things are.
      For applications when a working dataset is small enough that you can fit it on these cards they are apparently very good. If you need to shift a lot of stuff in from main memory on frequent occasions they are not and the AMD systems hooked together with infiniband look a lot better. For things that benefit from a huge amount of shared memory (2TB plus onboard and 160 cores or more) Intel Xeons on IBM boards interconnected look better.

      Due to being on the pcie bus these things compete directly with the nvidia stuff despite being x86.

    6. Re:Application? by angel'o'sphere · · Score: 1

      Does it execute x86 code? Does it support virtualization? I guess you could use it then to host lots of Linux VMs.

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    7. Re:Application? by JanneM · · Score: 1

      The Japanese K built by Fujitsu uses Sparc64.

      --
      Trust the Computer. The Computer is your friend.
    8. Re:Application? by TheRaven64 · · Score: 2

      So what exactly is the real world application of such a beast?

      All of the things where you really, really wish that you could do GPU offloading, but can't because you have diverging flow control and the GPU version ends up coming nowhere near to the theoretical peak performance of the hardware. The Xeon Phi cores are pretty simple, but there are loads of them, they have real branch prediction and caches (so handle the same kind of workloads as normal CPU cores, just a bit slower) and have fairly beefy vector units (so when they're running in a straight line they're actually pretty fast). Anything that you can make run on multiple threads should work nicely on them. That includes a load of HPC code that uses OpenMP, but which doesn't map particularly well to a GPU programming model without significant rewriting and redesigning of the core algorithms.

      --
      I am TheRaven on Soylent News
    9. Re:Application? by drinkypoo · · Score: 1

      So what exactly is the real world application of such a beast? Are there that many x64 based supercomputers out there?

      The two fastest supercomputers in the world are x86_64 based, as are in fact all but three of the top ten.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    10. Re:Application? by LWATCDR · · Score: 1

      " Are there that many x64 based supercomputers out there?"
      Yes.
      Including the number one and number two on the top 500 list.

      --
      See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
    11. Re:Application? by U2xhc2hkb3QgU3Vja3M · · Score: 1

      That may explain why the Bitcoins I'm mining on my AMD CPU end up as Dogecoins.

    12. Re:Application? by MachineShedFred · · Score: 1

      It does execute x86. However, I'm pretty sure that the VM hypervisor would need to be tailored to use these, and your memory bandwidth is severely reduced because it sits on a PCI-e link.

      These things are made for the same workloads that people use CUDA and OpenCL for. Seriously parallel processing with small-ish data sets.
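The "small-ish data sets" point can be sketched as a rough break-even model in Python. This is only a sketch: the function names and every bandwidth and throughput figure below are assumptions picked for illustration, not measurements of any real card or link.

```python
def host_time(flops, host_rate):
    """Seconds to do the work on the host CPU alone."""
    return flops / host_rate


def offload_time(flops, bytes_moved, accel_rate, link_bandwidth):
    """Seconds to ship the data over the link, then compute on the card."""
    return bytes_moved / link_bandwidth + flops / accel_rate


# Hypothetical figures, chosen only to show the shape of the trade-off.
HOST_RATE = 0.5e12   # 0.5 TFLOPS on the host (assumed)
ACCEL_RATE = 3.0e12  # 3 TFLOPS on the card (assumed)
LINK_BW = 16e9       # 16 GB/s over the link (assumed)


def offload_wins(flops, bytes_moved):
    """True when shipping the work to the card beats staying on the host."""
    return offload_time(flops, bytes_moved, ACCEL_RATE, LINK_BW) < host_time(flops, HOST_RATE)


# Lots of arithmetic per byte moved: offloading pays off.
compute_heavy = offload_wins(flops=1e12, bytes_moved=1e9)
# Little arithmetic per byte moved: the link dominates.
transfer_heavy = offload_wins(flops=1e10, bytes_moved=1e10)
print(compute_heavy, transfer_heavy)
```

Under these assumed numbers the first workload is worth offloading and the second is not, which is the same distinction as a working set that stays on the card versus one that keeps crossing the bus.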

      --
      Slashdot still doesn't support Unicode after it was added to the HTML standard in 1997.
    13. Re:Application? by SoftwareArtist · · Score: 1

      The architecture isn't really that different from a GPU, whatever Intel might try to make you believe. It has 512 bit vectors, compared to 1024 bit vectors on NVIDIA, so it's slightly less hurt by divergent flow control, but only slightly. The theoretical maximum teraflops (8 for single, 3 for double), are pretty similar to what NVIDIA is claiming for the just announced M40.
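A toy model of the vector-width point, in Python. It assumes each element takes a branch independently with the same probability, which real code rarely does, so treat it as a sketch of the trend rather than a prediction. The lane counts assume 32-bit elements; the function name is made up for this example.

```python
def divergence_probability(p_taken, lanes):
    """Chance that a group of `lanes` elements does not all agree on a branch.

    Toy assumption: every lane takes the branch independently with
    probability p_taken.
    """
    all_taken = p_taken ** lanes
    none_taken = (1.0 - p_taken) ** lanes
    return 1.0 - all_taken - none_taken


LANES_512 = 512 // 32    # 16 single-precision lanes
LANES_1024 = 1024 // 32  # 32 single-precision lanes

for p in (0.01, 0.1, 0.5):
    narrow = divergence_probability(p, LANES_512)
    wide = divergence_probability(p, LANES_1024)
    print(f"p={p}: 16 lanes diverge {narrow:.3f}, 32 lanes diverge {wide:.3f}")
```

In this model the narrower vector diverges less often when branches are rare, and the gap closes as the branch probability approaches one half, where both widths diverge almost every time.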

      And don't forget, Intel massively hyped the first generation of MIC, and it then turned out to be next to worthless. Hopefully they've fixed their problems this time around, but until we start seeing real world performance on third party codes, I'm going to be skeptical about everything they say.

      --
      "I'm too busy to research this and form an educated opinion, but I do have time to tell everyone my uninformed opinion."
  7. Why Intel doesn't utilize the latest node on it? by Anonymous Coward · · Score: 1

    I also wonder about the reason behind Intel's decision to use 22nm for Knights Landing instead of the latest sub-10nm node that it has

    Perhaps the latest node is not stable - or perhaps Intel wants to tap out the max value of whatever they had invested in the old 22nm node

  8. 3Tflops by kelemvor4 · · Score: 1

    So Intel is top dog given that nVidia is only producing 2.3Tflops, right?
    I guess AMD gave up on HPC. If I read the wiki right, their top card does 0.1Tflops

    1. Re:3Tflops by kelemvor4 · · Score: 2

      Guess I read it wrong. FirePro S9170 produces 2.6Tflops.

    2. Re:3Tflops by Anonymous Coward · · Score: 1

      Ahem. What?

      http://vrworld.com/2015/09/07/amd-r9-fury-x-potential-supercomputing-monster/

      Intel's KL isn't interesting for its total throughput. It's interesting in that the cores are all fully-programmable x86 CPUs. Furthermore, KL can fit into board sockets (no, not conventional LGA2011) and is fully bootable as well. What that means is that KL can be its own supercomputer node, whereas something like FirePro S requires a host system to serve as a node.

      (yes, the Fiji Fire Pro numbers are currently theoretical . . . that won't last long)

    3. Re:3Tflops by jon3k · · Score: 2

      I think the real question is FLOP/Watt. I really don't know how the two will stack up. Might also depend on whether or not the stream processors in nvidia gpus are better suited to the workload than x86 cores?

    4. Re:3Tflops by Anonymous Coward · · Score: 1

      I see my $1K NVIDIA Titan X report about 6.8 TFLOPS for single-precision float, but its double-precision is so bad I think it might not even exist and is really just running on the host CPU in that case.

    5. Re:3Tflops by byornski · · Score: 1

      The problem with cuda cores isn't that they aren't fully programmable. The problem is that branches are expensive and the copying of data to and fro are expensive unless you can do the entire calculation within the graphics card which is memory dependent.

    6. Re:3Tflops by byornski · · Score: 1

      The maxwell 2.0 chips apparently don't have double precision fp operations natively. PGI (owned by nvidia) refuse to release a compiler for openacc (openmp for graphics cards) based on it citing this reason.

    7. Re:3Tflops by dbIII · · Score: 1

      That was what Transmeta thought but their customers didn't. FLOP/$ is what matters more (sometimes) since the power bill over a lifetime is going to be less than the difference in price between a mid range AMD system (64 cores ~$10k) and a top end Intel system (80 faster hyperthreading cores ~$80k).
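The capital-versus-power comparison is simple arithmetic, sketched here in Python. The two system prices come from the comment above; the power draw, service life and electricity price are assumptions for illustration only, and cooling overhead is left out.

```python
HOURS_PER_YEAR = 24 * 365


def lifetime_energy_cost(watts, years, dollars_per_kwh):
    """Electricity cost of running a constant load for `years`."""
    kwh = watts / 1000.0 * HOURS_PER_YEAR * years
    return kwh * dollars_per_kwh


# From the comment: ~$10k mid-range system vs ~$80k top-end system.
price_gap = 80_000 - 10_000

# Assumed, not from the comment: 1 kW draw, 5-year life, $0.10/kWh.
energy = lifetime_energy_cost(watts=1000, years=5, dollars_per_kwh=0.10)

print(f"price gap: ${price_gap}, lifetime energy: ${energy:.0f}")
```

With these inputs the energy bill is a small fraction of the price gap. Different assumptions (a machine room drawing megawatts, say) change the balance, so the function is more useful than the particular numbers.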

    8. Re:3Tflops by TheRaven64 · · Score: 1

      FLOPS/Watt matters a lot to the customers of this kind of thing. When you're spending a few tens of millions on the supercomputer, you really don't care what the CPUs or accelerator cores cost. You do care about power consumption though, because that translates to cooling requirements and directly limits how big a system you can build.

      --
      I am TheRaven on Soylent News
    9. Re:3Tflops by dbIII · · Score: 1

      So please explain why Transmeta didn't take off despite aiming directly for that metric and why there are so many power hungry Xeons out there.
      You can't explain it?
      You don't know?


      Note to posters - please do not counter specific examples with a gut feeling - it makes you look like an idiot.

    10. Re:3Tflops by Junta · · Score: 1

      nvidia is pumping out 2.9 TFlop DP on their K80 (on paper). Of course on paper the numbers are as good as imaginary (across the board, Rpeak has been more and more a fantasy over time).

      --
      XML is like violence. If it doesn't solve the problem, use more.
    11. Re:3Tflops by raftpeople · · Score: 1

      ...and non-sequential/scattered read write patterns are difficult to implement efficiently. I had a problem that either had sequential input and non-sequential/scattered output or the opposite (either way worked) and it really didn't match well with GPU mem access methods.
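The "either way worked" remark can be illustrated with a permutation written two ways. This is a plain-Python sketch with a made-up permutation; it shows only that the two formulations give the same result, not how either performs on a GPU.

```python
def scatter(values, dest):
    """Sequential reads, scattered writes: out[dest[i]] = values[i]."""
    out = [None] * len(values)
    for i, v in enumerate(values):
        out[dest[i]] = v
    return out


def invert(dest):
    """Build the inverse permutation so that src[dest[i]] == i."""
    src = [0] * len(dest)
    for i, d in enumerate(dest):
        src[d] = i
    return src


def gather(values, src):
    """Scattered reads, sequential writes: out[j] = values[src[j]]."""
    return [values[s] for s in src]


values = ["a", "b", "c", "d"]
dest = [2, 0, 3, 1]  # hypothetical permutation

# Same result either way; only the memory access pattern differs.
assert scatter(values, dest) == gather(values, invert(dest))
```

Which side is sequential is a free choice here, but one side has to be scattered, and that is the part that fits poorly with coalesced memory access.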

    12. Re:3Tflops by Coren22 · · Score: 1

      Did someone urinate in your cereal this morning?

      --
      APK likes to ask for responses to the same things over and over. Maybe he just likes the responses?
    13. Re:3Tflops by TheRaven64 · · Score: 1

      So please explain why Transmeta didn't take off despite aiming directly for that metric

      and good performance, there was no reason to buy Transmeta processors. Note that they're not actually dead: nVidia bought Transmeta and used their ideas in their Project Denver ARM SoCs, which are selling pretty well now, in a different market where performance-per-Watt does matter.

      and why there are so many power hungry Xeons out there.

      There aren't. Xeons are so popular precisely because they give you the best performance within a given power envelope that you can currently buy (unless you're willing to go with custom accelerators or less general cores such as GPUs). Xeons became really popular once they started beating Opterons in performance per Watt. The new Cavium ThunderX parts claim to be in the same ballparks (better on some workloads, worse on others), but there hasn't been much competition for Xeons for a while.

      --
      I am TheRaven on Soylent News
    14. Re:3Tflops by MachineShedFred · · Score: 1

      It may matter a lot, but it's not the most important metric.

      If you're building a supercomputer, you're building it to calculate shit, and to calculate it as accurately as possible, as fast as possible. You design the computer first, and then the facility to house it after the design is done.

      Someone dropping tens of millions on a super computer isn't going to say "well, we already have this room here that can handle X watts of heat, so design your computer to simulate global weather patterns / thermonuclear detonations / astrophysics for that thermal capacity. After all, we absolutely cannot retrofit the room to add more cooling capacity."

      A $200k HVAC upgrade is nothing compared to the $millions of equipment going into the room.

      --
      Slashdot still doesn't support Unicode after it was added to the HTML standard in 1997.
    15. Re:3Tflops by spire3661 · · Score: 1

      FLOP/watt calculation is bounded by time and operating budget.

      --
      Good-bye
    16. Re:3Tflops by SoftwareArtist · · Score: 1

      You must be looking at one of the low end NVIDIA GPUs. Tesla K40 gets 4.29 teraflops. Tesla M40 (just announced last week), supposedly gets 7.

      --
      "I'm too busy to research this and form an educated opinion, but I do have time to tell everyone my uninformed opinion."
    17. Re:3Tflops by dbIII · · Score: 1
      With respect Mr gut feeling your ramblings are what is known as secondary considerations, as you would know if you did a bit more than guessing and going with the first thing that sounds right.

      Xeons became really popular once they started beating Opterons in performance per Watt

      They are already popular despite that not happening yet. Wrong guess. Maybe try something other than a guess next time?

    18. Re:3Tflops by TheRaven64 · · Score: 1

      They are already popular despite that not happening yet. Wrong guess. Maybe try something other than a guess next time?

      Where are you getting your numbers from? That's why we bought them, and it's why the companies that I talk to who buy them in lots of a thousand buy them. In the P4 days, we were buying Opterons almost exclusively. I think it's been five years since we last bought one.

      --
      I am TheRaven on Soylent News
    19. Re:3Tflops by dbIII · · Score: 1

      That's why we bought them

      Since many-way opterons with a lot of cores a bit slower than the leading edge still win that metric I very much doubt it.
      You ignored my numbers before so why should I state them again - just look a little back in the thread and you'll see a couple of examples of why capital cost is considered before running costs in this case. There is normally a pile of other things that get considered as well before flops/watt gets a say.

      I think it's been five years since we last bought one

      Most likely because single threaded performance is being considered far ahead of flops/watt. Why don't you ask the person that made the decision instead of guessing and insultingly acting to us as if you are the one that made the decision?

  9. McRAM? by tlambert · · Score: 1

    McRAM?

    Yes, I would fries with that.

    1. Re:McRAM? by ClickOnThis · · Score: 1

      Yes, I would fries with that.

      Fries? I think a bad puppy like this could deep-fry a whole turkey for you.

      --
      If it weren't for deadlines, nothing would be late.
    2. Re:McRAM? by Svenberg · · Score: 1

      Here in the UK you'd get it with Chips....

      (ok, I'll show myself out...)

  10. CISC? by zoid.com · · Score: 2

    I've been asleep for 20 years so I guess CISC won?

    1. Re:CISC? by ndykman · · Score: 3, Informative

      Kind of. The advantages of RISC faded pretty fast. The difference in decoder footprint between something like x86 and, say, ARM is really not that much, and a decoder is just a small part of a core these days. Clock speed is an issue of thermal footprint. So, all the disadvantages of the x86 (and its x64 extensions) faded in the face of Intel's focus on process improvements. In the end, not even the Itanium could eke out enough of a win to dethrone the x86 architecture.

    2. Re:CISC? by radarskiy · · Score: 1

      Neither CISC or RISC won.

      Data-driven design won out over faith-based instruction set architecture.

    3. Re:CISC? by angel'o'sphere · · Score: 1

      I would not call it faith ;) There were compelling reasons why all processors were once CISC.
      Considering that x86 is the only major CISC processor left, and that it internally translates CISC instructions into sets of RISC instructions before they get executed, and considering that everything else that is bigger than an 8 or 16 bit micro controller is RISC (Arm, Mips etc.), I would say: RISC has won.

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
    4. Re:CISC? by TheRaven64 · · Score: 1

      It's 20 years since 1989?

      --
      I am TheRaven on Soylent News
    5. Re:CISC? by LWATCDR · · Score: 1

      No x86/x64 won.

      --
      See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
    6. Re:CISC? by MachineShedFred · · Score: 1

      The complexity of the instruction set matters very little when you can just cache the decoded instructions in the processor. Intel solved that with Pentium Pro in 1995. By using ever-decreasing fabrication processes, they have die space to heap tons of cache in there - I think the current Xeons are somewhere around 2MB/core of cache...

      So your 20 year nap is just about right.

      --
      Slashdot still doesn't support Unicode after it was added to the HTML standard in 1997.
    7. Re:CISC? by spire3661 · · Score: 1

      20 years since this: https://www.youtube.com/watch?...

      --
      Good-bye
  11. Re:Nobody want to say anything? by TheDarkMaster · · Score: 1

    Windows response, a big red letter message:

    "WTF?"

    --
    Religion: The greatest weapon of mass destruction of all time
  12. Waiting for "kilocore" to be a common measure... by jeffb+(2.718) · · Score: 1

    ...but I suppose 640 kilocores should be enough for anybody.

  13. Knight's Landing, Knight's Corner by Megahard · · Score: 1

    So how fast can it calculate a Knight's Tour

    --
    I eat only the real part of complex carbohydrates.
  14. Re:Why Intel doesn't utilize the latest node on it by Anonymous Coward · · Score: 5, Interesting

    Defects in the bleeding-edge process are the main reason to use the older process. When they make one of these insane multi-core parts the die size is very large (sometimes taking up a whole 26 by 32mm scanner field), so the yields are hit harder by defects. On a more consumer-level chip they may have 4 or more die in a scanner field. A single defect in this field will take out one of the four die, resulting in a yield of 75% for that field. However, in the case of a single die for the whole field the yield would be zero with the exact same number of defects per mm^2. I am sure they have a greater understanding of where their defects come from on the older 22nm process these days and can ensure good yields even with a huge die size.

    An additional reason they would use the older process is that a chip of this level of complexity probably requires tighter overlay and critical dimension (CD) control than the "standard" 22nm process to work well. Having a well-defined process makes tuning all of these factors much easier, and it also helps determine whether the process or an issue in the design is at fault when initial silicon runs do not work exactly as intended.
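The yield arithmetic above can be sketched in Python. The first function reproduces the one-defect-per-field example from the comment. The second uses the Poisson yield model, a common first-order approximation that is not in the comment; the defect density fed into it is an assumed number, and only the 26 by 32mm field size comes from the text.

```python
import math


def field_yield(dies_per_field, defects_in_field):
    """Fraction of good dies in one scanner field.

    Simplifying assumption: each defect lands on a different die
    and kills it outright.
    """
    dead = min(defects_in_field, dies_per_field)
    return (dies_per_field - dead) / dies_per_field


def poisson_yield(die_area_mm2, defects_per_mm2):
    """Poisson yield model: chance that a die has zero defects."""
    return math.exp(-die_area_mm2 * defects_per_mm2)


FIELD_AREA = 26 * 32      # mm^2, scanner field size from the comment
DEFECT_DENSITY = 0.001    # defects per mm^2 (assumed for illustration)

small_die = field_yield(dies_per_field=4, defects_in_field=1)
huge_die = field_yield(dies_per_field=1, defects_in_field=1)

quarter_field = poisson_yield(FIELD_AREA / 4, DEFECT_DENSITY)
full_field = poisson_yield(FIELD_AREA, DEFECT_DENSITY)

print(small_die, huge_die)
print(f"{quarter_field:.2f} {full_field:.2f}")
```

Both versions point the same way: at a fixed defect density, a die that fills the whole field yields far worse than one that fills a quarter of it.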

  15. Re:Why Intel doesn't utilize the latest node on it by MojoKid · · Score: 2

    This has been corrected in the post. Intel is in fact using their 14nm node for Knights Landing.

  16. Will it play Crysis? by beltsbear · · Score: 1

    Or be able to load Windows 11?

    1. Re:Will it play Crysis? by U2xhc2hkb3QgU3Vja3M · · Score: 1

      But it'll BSOD a hell of a lot faster!

  17. Re:Fuck You! by ClickOnThis · · Score: 1, Insightful

    You are truly sick. Get some help. You, and the one who modded you insightful.

    All I did was provide a useful link. And you go nuclear-fractal about it, exploding with invective and unsubstantiated speculation.

    --
    If it weren't for deadlines, nothing would be late.
  18. Re:Fuck You! by Jeremi · · Score: 1

    I'm pretty sure that Anonymous Coward's response was itself propagating some other kind of meme, but I'm not motivated enough to look it up.

    Anyway, the thing is to take inappropriately over-the-top invective with a grain of salt, since it was probably intended to be tongue-in-cheek.

    --


    I don't care if it's 90,000 hectares. That lake was not my doing.
  19. Re: Fuck You! by Anonymous Coward · · Score: 1

    Are you a well adjusted, happy person? Ima guessing no.

  20. Excellent! by AndyKron · · Score: 1

    Excellent! Now we'll be able to process even more bullshit widgets on websites!

  21. Still just 4 cores for the desktop... by Moof123 · · Score: 1

    I am still annoyed that Skylake still only comes with 4 meager cores and some lousy graphics I will never make use of, and anything beyond that is a hockey stick price increase. Taunting us with 72 is just cruel.

    1. Re:Still just 4 cores for the desktop... by Blaskowicz · · Score: 1

      It still won't allow to have a baby in a month.

    2. Re:Still just 4 cores for the desktop... by Joe_Dragon · · Score: 1

      They also don't plan to have a desktop / same socket Intel Xeon chip without built-in video. For the last gen you can get a 4 core + HT chip for about $100 less than an i7.

    3. Re:Still just 4 cores for the desktop... by U2xhc2hkb3QgU3Vja3M · · Score: 2

      Have you tried having nine wives?

    4. Re:Still just 4 cores for the desktop... by MachineShedFred · · Score: 1

      I would imagine that the built-in video is actually wanted in the Xeon line, so you don't have to waste motherboard real estate adding a crappy video chip to the bill of materials.

      Many, if not practically every, server uses on-board video. Unless they run completely headless.

      --
      Slashdot still doesn't support Unicode after it was added to the HTML standard in 1997.
    5. Re:Still just 4 cores for the desktop... by Joe_Dragon · · Score: 1

      But a 16-32MB video chip with its own RAM is better than eating system RAM, which can be an issue in multi-CPU systems.

  22. News for (computer architecture) nerds... by ndykman · · Score: 3, Interesting

    While supercomputing is a very small section of the computing world, it's not that hard to understand.

    First of all, this would make for a terrible graphics card. This (deliberately) sits between a CPU and GPU. Each core in a Phi has more branching support, memory space, more complex instructions, etc than a GPU core, but is still more limited than a Xeon core (but it has wider SIMD paths).

    A GPU has many more cores that have a much more limited set of operations, which is what is needed for rapid graphics render. But, those limited sets of operations can also very useful in scientific computing.

    I haven't seen anybody try a three pronged approach (CPU/Phi/Nvidia Tesla), but I will admit I didn't look very hard. This is all in the name of solving really big problems.

    1. Re:News for (computer architecture) nerds... by waTeim · · Score: 1

      The specific application? I'm not sure, but the kind of application that could make use of this is a combination of traditional mechanisms and neural networks. Thus AI couched in big data. Spark of course would be part of it. There are a number of NN frameworks that make use of cuDNN. The first because so much has already been written assuming a standard kind of architecture; think OpenBLAS. Newer supervised learning results revolve around neural networks, and GPUs are best for that: even though the TFLOP comparison might be similar, the real limitation on NN is memory bandwidth, and GPUs win here. Plus all the additional circuitry dedicated to double precision is wasted on NN, but valuable for, say, PCA. Trivia question: what's the register size on the Phi? It's 512 bits, which can be manipulated to vectorize, but how to do that? Only a compiler can manage that, and both gcc and LLVM do so. At a higher layer and longer term, what language is best suited for this? I'm betting on Julia since the language is oriented to these problems and is centered on LLVM, but mostly because of the people.
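What a 512-bit register buys can be worked out in a few lines of Python. The register width comes from the comment; the helper names are made up for this sketch, and it counts only how many full-width operations are needed to touch each element once, ignoring alignment and memory effects.

```python
import math

REGISTER_BITS = 512  # register width stated in the comment


def lanes(element_bits):
    """How many elements of the given width fit in one register."""
    return REGISTER_BITS // element_bits


def vector_ops_needed(n_elements, element_bits):
    """Full-width vector operations needed to touch n_elements once."""
    return math.ceil(n_elements / lanes(element_bits))


print(lanes(32), lanes(64))
print(vector_ops_needed(1000, 32), vector_ops_needed(1000, 64))
```

Halving the element width doubles the lanes per register, which is part of why single-precision peak figures are quoted higher than double-precision ones.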

  23. If only the software... by Sir+Holo · · Score: 1

    I have eight (8) cores on my laptop. Frequently, a single multiprocessor-unaware application will hog an entire core, getting it hot, while asking nothing of the other seven (7). These applications are typically very expensive ones, so you might think that they would make use of them.

    Oh, but no. Give me two cores, 100 cores, or anywhere in between. I, as a power-user, will actually never notice a difference.

    Get the programmers to write MPA software. Only then will I think about believing the hype about multiple cores.

    1. Re:If only the software... by invictusvoyd · · Score: 2

      That is what happens when u use Windows.

    2. Re:If only the software... by Sir+Holo · · Score: 1

      That is what happens when u use Windows.

      Actually, this is what happens when you (I) use Adobe products.

      The open-source Image-J is far more agile in processing my 100,000+ image-stacks.

    3. Re:If only the software... by dbIII · · Score: 1

      Frequently, a single multiprocessor-unaware application will hog an entire core, getting it hot

      While that is very annoying, at least the OS switches it over to another core every now and again to avoid overheating, as a process monitor like "gkrellm" will show you.

      Get the programmers to write MPA software

      Give them a break, developers are only just getting their teeth into 64 bit and you want them to write stuff as if it's 1999? Please give them at least twenty years to get used to the hardware :)

    4. Re:If only the software... by Blaskowicz · · Score: 1

      The imaginary cores do work, getting +30% out of them is ordinary even in games nowadays.

    5. Re:If only the software... by MachineShedFred · · Score: 1

      The good news is that a Xeon Phi isn't ever going to be installed anywhere but a data center, so you don't have to worry about it. It will churn through data sets by running an application specifically written for it.

      This isn't a high-volume product for Intel - they probably have a couple hundred customers that use these things. But when they do use them, they use a LOT of them because they are building supercomputers that have thousands of cores.

      --
      Slashdot still doesn't support Unicode after it was added to the HTML standard in 1997.
    6. Re:If only the software... by im_thatoneguy · · Score: 1

      Uhhhh, this is what happens when you use any application written in C, C++, Java, C#, PHP, Python or just about any programming language without adding threading code. The OS is irrelevant.
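
      A minimal sketch of the point above, in Python. The prime-counting workload and the helper names (count_primes, split_range, count_primes_parallel) are made up for illustration. The serial call keeps one core busy on any OS; the other cores only get work when the program explicitly splits the job and hands the pieces to a worker pool.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def count_primes(lo, hi):
    """Count primes in [lo, hi) by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def split_range(lo, hi, parts):
    """Split [lo, hi) into `parts` contiguous chunks of near-equal size."""
    step, extra = divmod(hi - lo, parts)
    chunks, start = [], lo
    for i in range(parts):
        end = start + step + (1 if i < extra else 0)
        chunks.append((start, end))
        start = end
    return chunks


def count_primes_parallel(lo, hi, workers=None):
    """Same answer as count_primes, computed by a pool of worker processes."""
    workers = workers or os.cpu_count() or 1
    chunks = split_range(lo, hi, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # zip(*chunks) turns [(lo, hi), ...] into one sequence of starts and one of ends.
        return sum(pool.map(count_primes, *zip(*chunks)))


if __name__ == "__main__":
    # Serial: one core does all the work, whatever the OS.
    print(count_primes(0, 50_000))
    # Parallel: only because the code asks for it.
    print(count_primes_parallel(0, 50_000))
```

      Processes rather than threads are used here because CPython threads do not run CPU-bound Python code in parallel; in C, C++, Java or C# the same split would normally be done with threads.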

  24. Re:Fuck You! by Hylandr · · Score: 1

    Liberal detected.

    --
    ~ People that think they are better than anyone else for any reason are the cause of all the strife in the world.
  25. Re:Fuck You! by Hylandr · · Score: 1

    I am waiting for the Mach30TurboLazer. Call me a Luddite. :)

    --
    ~ People that think they are better than anyone else for any reason are the cause of all the strife in the world.
  26. Re:Why Intel doesn't utilize the latest node on it by jellomizer · · Score: 1

    Price for performance?
    The chips, no matter how small their transistors are, still need to fit on a standard die.
    To get these chips to run faster you can add more transistors and/or better optimize them for their use.
    When you focus on the latter there is less of a case of running out of space.
    If we buy a bigger home we don't buy bigger furniture just more of it.

    --
    If something is so important that you feel the need to post it on the internet... It probably isn't that important.
  27. Finally! by reboot246 · · Score: 1

    Something fast enough to run Minecraft!

    1. Re:Finally! by Coren22 · · Score: 1

      Not enough memory for Minecraft.

      --
      APK likes to ask for responses to the same things over and over. Maybe he just likes the responses?
  28. Re: Why Intel doesn't utilize the latest node on i by IBME · · Score: 1

    What IS really interesting is that the technology offered to the masses today is last year's news, read: only what they want us to see. I know for a fact that the big money has stuff we see in movies. Society is getting to the point where technology is literally climbing up our asses, which is why all the tremors over backdoors, encryption, and in general the mass confusion of information people are fighting over. Greed has quite literally become psychotic. News at 11.

  29. Re:What are the applications? by RyuuzakiTetsuya · · Score: 1

    Non ironic Beowulf clusters.

    --
    Non impediti ratione cogitationus.
  30. Re:Nobody want to say anything? by Whatsmynickname · · Score: 1

    If Windows is now fast enough, Windows 11 will be out and this CPU will be minimum requirements.

  31. Not actually that impressive.... by Junta · · Score: 1

    So the product is Intel's not quite released compute accelerator, featuring new micro architecture, memory technology, and using the latest chip fab capabilities.

    The most readily available competition with released numbers is an nVidia K80, a year-old product using 5-year-old memory technology and 5-year-old chip fab capabilities, set to be superseded by a refresh using state-of-the-art fab, memory, and microarchitecture, which would actually compete toe to toe with what Intel announced.

    This *should* make for an unambiguous trouncing of the nVidia product by Intel. So let's compare some metrics (not the best mechanism, but without real world numbers, settling for Rpeak and such).
    Compute stands at 10TFlops SP and 2.9 TFlops DP on the K80, meaning Intel's brand new offering doesn't reach the SP performance of an 'ancient' product and barely edges them out on DP.
    Memory capacity is actually lower than the K80 as well (16 GB Intel vs. 24 GB nVidia).
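
    The on-paper comparison can be laid out as simple ratios. This is only a sketch using the peak figures quoted in this thread: the Knight's Landing values are the lower bounds from the summary ("over 8" and "exceeding 3" teraflops), the K80 values are the ones given above, and none of them are measured results.

```python
# Peak figures as quoted in the thread, not benchmarks.
# Knight's Landing numbers are lower bounds taken from the story summary.
knl = {"sp_tflops": 8.0, "dp_tflops": 3.0, "mem_gb": 16}
k80 = {"sp_tflops": 10.0, "dp_tflops": 2.9, "mem_gb": 24}


def ratios(a, b):
    """Return a[k] / b[k] for every metric k."""
    return {k: a[k] / b[k] for k in a}


r = ratios(knl, k80)
for metric, value in r.items():
    print(f"{metric}: Knight's Landing is {value:.2f}x the K80")
```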

    There is of course a chance that, even when going toe to toe with Pascal, the ability to actually extract the promised performance will be better, but given that this doesn't unambiguously trump the K80 on paper, it's quite likely that Pascal will be overwhelming.

    I also think the MIC product line will become redundant around Sky Lake dual socket time, when the main processor line starts having the AVX512 goodness.

    --
    XML is like violence. If it doesn't solve the problem, use more.
    1. Re:Not actually that impressive.... by asliarun · · Score: 1

      GPU floating point performance has been leading general purpose x86 CPU floating point performance by an order of magnitude - for many many years now. There's nothing new in what you are saying.

      What is indeed new is that this is the first general purpose x86 based solution that gives you similar floating point performance as a graphics card. And you get all the advantages of the general purpose CPUs as well as all the x86 codebase you might want to support.

      There must also be a reason why the number 1 supercomputer on the planet, the Tianhe 2, uses Xeon Phi.

      Oh, and while the dedicated RAM allocation is much smaller than the nVidia card in question, it is also much higher bandwidth, and is stacked RAM, similar to AMD's HBM.

    2. Re:Not actually that impressive.... by MachineShedFred · · Score: 1

      Yeah, Nvidia can compete toe-to-toe with their next-gen product, until a branch comes along. Branching on GPU compute is ridiculously expensive. This is not so with Xeon Phi.

      That's where this product makes sense.
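
      A toy cost model of why divergent branches hurt lockstep hardware, assuming a group of lanes that must all execute the same instruction stream. The function name and cycle counts are hypothetical, and real GPUs (and the vector units on any CPU) are considerably more subtle than this.

```python
def lockstep_cycles(takes_then, then_cost, else_cost):
    """Cycles for one lockstep group of lanes hitting an if/else.

    Every path that at least one lane takes is executed once for the
    whole group, with the lanes that did not take it masked off.
    """
    cycles = 0
    if any(takes_then):
        cycles += then_cost
    if not all(takes_then):
        cycles += else_cost
    return cycles


# 32 lanes that all agree pay for one path only.
uniform = lockstep_cycles([True] * 32, then_cost=100, else_cost=100)

# A single lane disagreeing makes the whole group pay for both paths.
divergent = lockstep_cycles([True] * 31 + [False], then_cost=100, else_cost=100)
```

      Under this model, independent cores each pay only for the path they actually take, which is the advantage being claimed for a many-core x86 part.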

      --
      Slashdot still doesn't support Unicode after it was added to the HTML standard in 1997.
  32. Re:Fuck You! by U2xhc2hkb3QgU3Vja3M · · Score: 1

    Sure! Here you go!

  33. Re: Why Intel doesn't utilize the latest node on i by Coren22 · · Score: 2

    Society is getting to the point where technology is literally climbing up our asses,

    So you are saying that "big money" has next generation butt plugs?

    --
    APK likes to ask for responses to the same things over and over. Maybe he just likes the responses?
  34. Re:Fuck You! by Coren22 · · Score: 1

    http://starwars.wikia.com/wiki...

    Are you sure you want to shave with one though, that might hurt...

    --
    APK likes to ask for responses to the same things over and over. Maybe he just likes the responses?
  35. Re:Why Intel doesn't utilize the latest node on it by boristdog · · Score: 1

    Yeah, my first thought on seeing that monster chunk of silicon: "Defectivity is going to make that thing expensive as hell."

  36. Re:Why Intel doesn't utilize the latest node on it by ssam · · Score: 1

    Are they not using the trick of selling chips with a defective core as a lower core count (like the old Phenom X3)? I assumed that was why you get strange numbers like 61 cores.
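
    A rough sketch of why that trick helps, assuming each core is independently good with the same probability. The core counts and the 98% figure below are invented for illustration, not Intel's numbers.

```python
from math import comb


def p_at_least(good_needed, total, p_core_ok):
    """Probability that at least `good_needed` of `total` cores work,
    treating cores as independent with per-core success rate `p_core_ok`."""
    return sum(
        comb(total, k) * p_core_ok ** k * (1 - p_core_ok) ** (total - k)
        for k in range(good_needed, total + 1)
    )


# Hypothetical die with 62 physical cores and a 98% chance that any one core is good.
all_good = p_at_least(62, 62, 0.98)       # sellable only if every core works
one_spare = p_at_least(61, 62, 0.98)      # sellable with one core fused off

print(f"need 62 of 62: {all_good:.1%}")
print(f"need 61 of 62: {one_spare:.1%}")
```

    With these made-up inputs, tolerating a single dead core roughly doubles the share of usable dies, which would be one way to end up with an odd count like 61.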

  37. Re:Fuck You! by rochrist · · Score: 1

    Maybe you could, you know, stop posting as an anonymous coward before you go on this rant. It might make it slightly easier to impress people. Probably not though, given the rather lame content.

  38. Re:Fuck You! by Hylandr · · Score: 1

    It would certainly be a close shave. :)

    --
    ~ People that think they are better than anyone else for any reason are the cause of all the strife in the world.
  39. Re:Fuck You! by Coren22 · · Score: 1

    It would give a whole new meaning to razor burn.

    --
    APK likes to ask for responses to the same things over and over. Maybe he just likes the responses?
  40. Re:Fuck You! by uninformedLuddite · · Score: 1

    How about you go and fuck yourself you pathetic sack of shit

    --
    The new right fascists are bilingual. They speak English and Bullshit.
  41. Re:Why Intel doesn't utilize the latest node on it by beastofburdon · · Score: 1

    Another reason they may use the larger version is that they simply have much more room to work with. In a CPU there is a very small area that can be taken up by cores, since so many things have been integrated these days, like video processors and the north-bridge. Here they have an entire PCI-E card to play with, so they can use the larger architecture without worrying about space.
    This is merely my conjecture though.