8-Core Intel Nehalem-EX To Launch This Month
MojoKid writes "What could you do with 8 physical cores of CPU processing power? Intel's upcoming 8-core Nehalem-EX is launching later this month, according to Intel Xeon Platform Director Shannon Poulin. The announcement puts to rest rumors that the 8-core part might be delayed, and makes good on a promise Intel made last year when the chip maker said it would release the chip in the first half of 2010. To quickly recap, Nehalem-EX boasts an extensive feature-set, including up to 8 cores per processor, up to 16 threads per processor with Intel Hyper-Threading, scalability up to eight sockets via Intel's serial QuickPath Interconnect and more with third-party node controllers, and 24MB of shared cache."
Ah! My dream of the day when I can boot up and see penguins taking up the entire screen is almost here.
Now we know what will be needed to run Win 8, I guess.
I better get started on my backyard fusion power plant....;-)
Down With Slashdot BETA!!! I've been around the corner and seen the oliphant; you can only abuse me from your perspecti
Does it have the memory I/O bandwidth to keep up with the CPUs? When will I be able to actually buy a motherboard with 8 of these 8-core CPUs, and what kind of frame rate would Crysis get on that rig?
I've abandoned my search for truth; now I'm just looking for some useful delusions.
This processor is meant for servers, because it's a Xeon, and with all the Web 2.0 and cloud computing going on, servers are always hungry for more power.
This space for rent.
The end to "can it run Crysis?" jokes!
http://arstechnica.com/hardware/news/2009/09/ibms-8-core-power7-twice-the-muscle-half-the-transistors.ars
I am sure there are plenty of applications out there that can take advantage of this new hardware. I run finite element and computational fluid dynamics software at work, and both are capable of using the 8 cores in my work PC (dual quad-core).
The really sad part though is that for the FEA software I can only use 2 cores because the vendor requires customers to buy a separate HPC license for every processor/core beyond 2.
Software developers are going to have to figure out a new approach to licensing many of their products. VMware, for example, allows you to use a single license for every processor of 6 or fewer cores... how many people are going to pay for another license for the 2 extra cores? I see per core licenses coming in the near future.
Don't know about games, but many types of numerical processing can easily take advantage of this. ATLAS and other high-performance linear algebra libraries already use all available cores (no, IO is often not the biggest bottleneck with these libraries, as they seem to squeeze out all possible advantages from the L1 / L2 caches). In other words, for my scientific computations, I would definitely notice a difference.
Also, OpenMP is becoming easier and easier to use with recent gcc releases, and it only takes a few #pragma statements in some parts of the code to give a huge speedup if you know what you're doing and have appropriate code.
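To make the "few #pragma statements" point concrete: in C, a directive such as `#pragma omp parallel for reduction(+:total)` splits a loop across threads and then combines the per-thread partial sums. The sketch below is not OpenMP (which targets C, C++ and Fortran); it spells out the same split-and-combine pattern by hand in Python. The names `partial_sum` and `parallel_sum_of_squares` and the chunking scheme are made up for illustration. It defaults to threads so it runs anywhere, which in CPython shows the structure rather than a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(start, stop):
    """Sum of squares over [start, stop); each chunk is independent."""
    return sum(i * i for i in range(start, stop))


def parallel_sum_of_squares(n, workers=8, executor_cls=ThreadPoolExecutor):
    """Split range(n) into contiguous chunks, sum each chunk in a worker,
    then combine the partial results (the 'reduction' step)."""
    step = max(1, -(-n // workers))  # ceiling division, at least 1
    bounds = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    with executor_cls(max_workers=workers) as pool:
        futures = [pool.submit(partial_sum, lo, hi) for lo, hi in bounds]
        return sum(f.result() for f in futures)


if __name__ == "__main__":
    print(parallel_sum_of_squares(1_000_000))
```

For CPU-bound work you would likely pass `executor_cls=ProcessPoolExecutor` (same module) so the chunks can actually occupy separate cores; `partial_sum` is a module-level function, so it can be pickled and sent to worker processes.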
Does having a witty signature really indicate normality?
http://www.sun.com/processors/UltraSPARC-T2/
And the future UltraSPARC T3 will have 16 cores and 8 threads per core, for a total of 128 threads per chip.
http://arstechnica.com/business/news/2010/02/two-billion-transistor-beasts-power7-and-niagara-3.ars?utm_source=rss&utm_medium=rss&utm_campaign=rss
So can we now expect a doubling of cores every 18 months?
In other news, AMD has a blog article on its soon-to-be-launched competitor to this, the Socket G34 8-core/12-core Opterons:
http://blogs.amd.com/work/2010/02/22/magny-cours-is-right-on-schedule-and-shipping-to-customers/
This article outlines the various circumstances under which hyperthreading either benefits or impedes performance. While it's true that on average the benefit was zero (meaning about half of what they tested was faster, and about half was slower) there are clearly a lot of applications that see significant performance gains.
It should also be noted that the applications that benefit are ones that would generally be used in Xeon (server and workstation) machines. Further, most of the applications that failed to benefit from hyperthreading are not written to take advantage of many (more than one or two) cores. As applications are updated for "many core" systems, it is likely that the benefit from hyperthreading will become more significant.
In any case, it is far from "established" that hyperthreading has "no benefit."
... super cool looking white plastic mold which fits my socket, and a cool looking notepad!
--- I am known for the ones who want to find me on the net. Is that a privacy risk or a privilege? One might wonder..
Hyperthreading used to suck, but it works pretty well now. In the benchmarks I've done with my code I see about a 60% speedup.
http://www.vips.ecs.soton.ac.uk/index.php?title=Benchmarks#Results_summary
Better than that, with a properly multi-threaded web browser we'll be able to display sixteen animated Flash ads simultaneously with no slowdown!
This makes me sad. Web 2-point-Oh is such a waste of a perfectly good 8-core processor.
10 years ago if you had told me about an 8-core processor I would have imagined using it for kick-of-the-ass games, immersive virtual reality, editing 3D video and simulating newer, more deadly designs of chainsaw chain.
But noo, instead they are used to pump out inefficient JavaShit-based versions of the Desktop software we had in '93 with a shiny new rounded corner interface to web browsers around the world. Great.
So, how soon until newegg.com has the fake ones in stock?
I see even classic Slashdot is pretty much unusable on dial-up now.
Yeah, it really bugs me how 95% of a web site's load time and processing load is accounted for by a few pretty features like rounded corners and drop shadows.
How about we put those effects into CSS where they belong and not induce massive load by simulating them with 5 MB of JavaScript?
I hate printers.
Am I the only one here who understands that client-side JavaScript has absolutely nothing to do with how many cores your server has?
Web 1.0 can use plenty of cores, too, but generally your Web x.x requirements and your required server core count are orthogonal. Bandwidth and latency requirements for Web 2.0 are a different story, though. Those things tend to scale depending on how shitty your programmers are.
Who is talking about servers? I'm thinking about my home machines, you know, where the client-side JavaScript runs...
"linux is just DOS with a UNIX like syntax" -- Galactic Dominator (944134)
People have been arguing as you are that x86's bloated CISC instruction set was inferior to a cleaner RISC architecture for the last 20+ years. Nobody has ever proven that the elegance of the instruction set matters with hard data though.
What evidence we do have goes against that argument.
Apple machines used a cleaner RISC architecture for a while in the desktop space. They never performed any better than equivalent x86 based machines, and in the end Apple abandoned RISC and moved to x86.
Intel came out with a cleaner RISC based instruction set that the Itanium line uses. If x86 was really as bad as you say, Itanium chips would be running circles around the x86 based server chips provided by both Intel and AMD. That isn't happening.
Another thing you might not realize: all x86 chips, from both Intel and AMD, once you strip them down to the microcode level ARE RISC designs under the hood. RISC is the cleaner way to implement the microcode and the underlying execution architecture, but all historical data seems to indicate that the question of whether the instruction set that sits on top of that is RISC or CISC is irrelevant to performance. It is arguably more complicated to design a CISC based chip like x86, but that clearly has not been an obstacle to competing with RISC on the performance end for Intel or AMD engineers.
In a minute there is time For decisions and revisions which a minute will reverse. -T.S. Eliot
According to my server metric graphs the additional threads are only useful for WIO CPU states.
For example, on a quad-core Intel i7 920, enabling Hyper-Threading presents four additional virtual cores. But the CPU utilization reported by the metrics software shows that USR and SYS CPU time only go up to 50%, and WIO adds another 12%. That corresponds to one virtual core being used to wait on I/O. The other 3 virtual cores do nothing at all.
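The arithmetic behind that reading can be checked with a few lines. This is only a sketch, and it assumes the metrics tool reports each state as a percentage of the total capacity of all 8 logical CPUs (the post does not say how the tool aggregates); the helper name `busy_logical_cpus` is made up for the example.

```python
def busy_logical_cpus(percent_of_total, logical_cpus):
    """Convert an aggregate utilization percentage into 'logical CPUs busy'.
    Assumes the percentage is relative to all logical CPUs combined."""
    return percent_of_total * logical_cpus / 100


PHYSICAL_CORES = 4                  # i7 920
LOGICAL_CPUS = PHYSICAL_CORES * 2   # Hyper-Threading: 2 threads per core

usr_sys = busy_logical_cpus(50, LOGICAL_CPUS)  # USR + SYS ceiling from the post
wio = busy_logical_cpus(12, LOGICAL_CPUS)      # WIO share from the post
idle = LOGICAL_CPUS - usr_sys - wio            # whatever is left over

print(f"USR+SYS: {usr_sys:.2f} logical CPUs")
print(f"WIO:     {wio:.2f} logical CPUs")
print(f"unused:  {idle:.2f} logical CPUs")
```

Under that assumption, 50% works out to four logical CPUs' worth of USR and SYS time, 12% to roughly one, and about three are left over, which matches the numbers in the post. It does not show which logical CPUs did the work, so it cannot by itself confirm that the extra threads were idle.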
> People have been arguing as you are that x86's bloated CISC instruction set was inferior to a cleaner RISC architecture for the last 20+ years. Nobody has ever proven that the elegance of the instruction set matters with hard data though.
> What evidence we do have goes against that argument.
The only evidence that we have is that the benefits of commoditization and economies of scale often outweigh any architectural advantages. The fact that x86 incorporated many elements of RISC would also demonstrate its value.
> Apple machines used a cleaner RISC architecture for a while in the desktop space. They never performed any better than equivalent x86 based machines, and in the end Apple abandoned RISC and moved to x86.
Manufacturing processes simply trumped architectural differences. PowerPCs have never been manufactured on anywhere near the scale of x86.
> Intel came out with a cleaner RISC based instruction set that the Itanium line uses. If x86 was really as bad as you say, Itanium chips would be running circles around the x86 based server chips provided by both Intel and AMD. That isn't happening.
Itanium is EPIC, not RISC. It is the exact opposite of RISC. It may not be running circles around x86, but that may be due to compilers not yet being advanced enough to take full advantage of the architecture. We may still see this change in the future.
http://astutehosting.com/
Not with an X25-E.
It will improve gaming performance if you happen to be running something like Quake Wars with ray tracing.
Intel put together a demo on a workstation system with two Nehalem quad-core CPUs getting about 15 - 20 fps.
Since ray tracing is embarrassingly parallel, all one needs to do to improve performance is to throw more cores at it.
Keep in mind ray tracing is much more CPU intensive than GPU intensive...
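A toy example of why ray tracing counts as embarrassingly parallel: each pixel's ray can be evaluated without looking at any other pixel, so rows can be handed to workers in any order and the image comes out the same. The sketch below is a made-up single-sphere ray caster, not the Intel demo; the scene constants and the names `hit`, `render_row` and `render` are invented for illustration. It uses threads so it runs anywhere; in CPython that demonstrates the independence, not a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT = 32, 16
SPHERE_Z, SPHERE_R = 3.0, 1.0   # sphere centred on the view axis


def hit(px, py):
    """True if the ray through pixel (px, py) hits the sphere.
    The ray starts at the origin; no other pixel's result is needed."""
    # Map the pixel to a direction through an image plane at z = 1.
    dx = (px + 0.5) / WIDTH * 2 - 1
    dy = (py + 0.5) / HEIGHT * 2 - 1
    dz = 1.0
    # Ray-sphere test: |t*d - c|^2 = r^2 has a real solution t
    # exactly when the quadratic's discriminant is non-negative.
    a = dx * dx + dy * dy + dz * dz
    b = -2 * dz * SPHERE_Z
    c = SPHERE_Z * SPHERE_Z - SPHERE_R * SPHERE_R
    return b * b - 4 * a * c >= 0


def render_row(py):
    """Render one scanline as text; rows share no mutable state."""
    return "".join("#" if hit(px, py) else "." for px in range(WIDTH))


def render(workers=1):
    """Render serially, or farm rows out to a pool of workers."""
    if workers == 1:
        return [render_row(py) for py in range(HEIGHT)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(render_row, range(HEIGHT)))  # map keeps row order


if __name__ == "__main__":
    print("\n".join(render(workers=8)))
```

Because `render(1)` and `render(8)` produce identical rows, adding workers changes only how the rows are scheduled. A real renderer would do this with native threads or processes, and shared scene data and memory bandwidth would eventually limit the scaling.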
"I am the king of the Romans, and am superior to rules of grammar!"
-Sigismund, Holy Roman Emperor (1368-1437)