8-Core Intel Nehalem-EX To Launch This Month
MojoKid writes "What could you do with 8 physical cores of CPU processing power? Intel's upcoming 8-core Nehalem-EX is launching later this month, according to Intel Xeon Platform Director Shannon Poulin. The announcement puts to rest rumors that the 8-core part might be delayed, and makes good on a promise Intel made last year when the chip maker said it would release the chip in the first half of 2010. To quickly recap, Nehalem-EX boasts an extensive feature set, including up to 8 cores per processor, up to 16 threads per processor with Intel Hyper-Threading, scalability up to eight sockets via Intel's serial QuickPath Interconnect and more with third-party node controllers, and 24MB of shared cache."
Ah! My dream of the day when I can boot up and see penguins taking up the entire screen is almost here.
Run a REAL operating system, like VISTA!
http://rocknerd.co.uk
Now we know what will be needed to run Win 8, I guess.
I better get started on my backyard fusion power plant....;-)
Down With Slashdot BETA!!! I've been around the corner and seen the oliphant; you can only abuse me from your perspecti
Does it have the memory I/O bandwidth to keep up with the CPUs? When will I be able to actually buy a motherboard with 8 of these 8-core CPUs, and what kind of a frame rate would Crysis get on that rig?
I've abandoned my search for truth; now I'm just looking for some useful delusions.
The end to "can it run Crysis?" jokes!
But how long before game makers and other software companies write code that can take advantage of all those cores? By the time they do, Intel or AMD will have mainstream 32 or 64 core processors on the market.
http://arstechnica.com/hardware/news/2009/09/ibms-8-core-power7-twice-the-muscle-half-the-transistors.ars
..they want their joke back. Windows 7 runs perfectly fine on 6-year-old machines. But MS is known for making shitty OSes with alternate versions, so Windows 8 may still suck... though initial impressions are that not much will change from Windows 7.
This space for rent.
Software developers are going to have to figure out a new approach to licensing many of their products. VMware, for example, allows you to use a single license for every processor of 6 or fewer cores... how many people are going to pay for another license for the 2 extra cores? I see per core licenses coming in the near future.
http://www.sun.com/processors/UltraSPARC-T2/
And the future UltraSPARC T3 will have 16 cores and 8 threads per core, for a total of 128 threads per chip.
http://arstechnica.com/business/news/2010/02/two-billion-transistor-beasts-power7-and-niagara-3.ars?utm_source=rss&utm_medium=rss&utm_campaign=rss
So can we now expect a doubling of cores every 18 months?
Why are they still announcing hyperthreading? It was established long ago that it had no benefit. It's been off on any machines I've ever purchased.
Now I can run all my crapware, viruses, trojans, malware, and other dubious software bits at FULL SPEED! Yay
Agent K: A *person* is smart. People are dumb, stupid, panicky animals, and you know it.
In other news, AMD has a blog article on its soon-to-be-launched competitor to this, the Socket G34 8-core/12-core Opterons:
http://blogs.amd.com/work/2010/02/22/magny-cours-is-right-on-schedule-and-shipping-to-customers/
This article outlines the various circumstances under which hyperthreading either benefits or impedes performance. While it's true that on average the benefit was zero (meaning about half of what they tested was faster, and about half was slower) there are clearly a lot of applications that see significant performance gains.
It should also be noted that the applications that benefit are ones that would generally be used in Xeon (server and workstation) machines. Further, most of the applications that failed to benefit from hyperthreading are not written to take advantage of many (more than one or two) cores. As applications are updated for "many core" systems, it is likely that the benefit from hyperthreading will become more significant.
In any case, it is far from "established" that hyperthreading has "no benefit."
... super cool looking white plastic mold which fits my socket and cool looking notepad!
--- I am known for the ones who want to find me on the net. Is that a privacy risk or a privilege? One might wonder..
Better than that, with a properly multi-threaded web browser we'll be able to display sixteen animated Flash ads simultaneously with no slowdown!
I'm pretty sure that x86 doesn't deserve the blame for the fact that on-die cache is astronomically expensive compared to offboard DRAM...
First off, it went away for a long time. The P4s had Hyper-Threading but the Pentium Ds and Core 2s (Duos and Quads) didn't. It didn't come back until the i7.
The other reason is that it is useful now. When HT first came out, it was pretty much for desktop chips and we were still very much a single-core world. Little software was designed to truly take advantage of multiple threads in that environment, so people noticed no real speedup. Now, however, not only is software better at using multiple cores, but the server market is a target for this as well. On servers, multiple threads per core in hardware work well. You frequently get situations where you have processes that don't need much processor time, but need it often. The context switching can be a killer in terms of overhead. Keeping more threads resident on the chip mitigates that and makes more efficient use of the silicon.
Sun is doing this to a much greater degree, in fact. Their new UltraSPARC processors run more than two threads per core. Probably not that useful on a desktop at this point, but it can be very useful on a web server.
Hyperthreading is something likely to stick with us at this point. We are moving away from computers that only did one thing at a time, and simply switched back and forth between tasks and towards computers that do a whole lot in parallel.
So, how soon until newegg.com has the fake ones in stock?
I see even classic Slashdot is now pretty much unusable on dial-up.
People have been arguing as you are that x86's bloated CISC instruction set was inferior to a cleaner RISC architecture for the last 20+ years. Nobody has ever proven that the elegance of the instruction set matters with hard data though.
What evidence we do have goes against that argument.
Apple machines used a cleaner RISC architecture for a while in the desktop space. They never performed any better than equivalent x86 based machines, and in the end Apple abandoned RISC and moved to x86.
Intel came out with a cleaner RISC based instruction set that the Itanium line uses. If x86 was really as bad as you say, Itanium chips would be running circles around the x86 based server chips provided by both Intel and AMD. That isn't happening.
Another thing you might not realize: all x86 chips, from both Intel and AMD, once you strip them down to the micro-code level ARE RISC designs under the hood. RISC is the cleaner way to implement the micro code and the underlying execution architecture, but all historical data seems to indicate that the question of whether the instruction set that sits on top of that is RISC or CISC is irrelevant to performance. It is arguably more complicated to design a CISC based chip like x86, but that clearly has not been an obstacle to competing with RISC on the performance end for Intel or AMD engineers.
In a minute there is time For decisions and revisions which a minute will reverse. -T.S. Eliot
I don't see anyone talking about the cost of this, any ideas?
People have been arguing as you are that x86's bloated CISC instruction set was inferior to a cleaner RISC architecture for the last 20+ years. Nobody has ever proven that the elegance of the instruction set matters with hard data though.
What evidence we do have goes against that argument.
The only evidence that we have is that the benefits of commoditization and economies of scale often outweigh any architectural advantages. The fact that x86 incorporated many elements of RISC would also demonstrate its value.
Apple machines used a cleaner RISC architecture for a while in the desktop space. They never performed any better than equivalent x86 based machines, and in the end Apple abandoned RISC and moved to x86.
Manufacturing processes simply trumped architectural differences. PowerPCs have never been manufactured on anywhere near the scale of x86.
Intel came out with a cleaner RISC based instruction set that the Itanium line uses. If x86 was really as bad as you say, Itanium chips would be running circles around the x86 based server chips provided by both Intel and AMD. That isn't happening.
Itanium is EPIC, not CISC. It is the exact opposite of RISC. It may not be running circles around x86, but that may be due to compilers not yet being advanced enough to take full advantage of the architecture. We may still see this change in the future.
http://astutehosting.com/
You explain away the fact that PowerPC never performed any better than the corresponding x86 counterparts as manufacturing differences, yet fail to offer an explanation for why those two are related. Economies of scale mean you can make more chips cheaper. If you think that mass producing something magically makes an inferior product perform better than a product that isn't mass produced, please explain how. There isn't any industry I'm aware of where that happens.
You might be trying to argue that PowerPC failed because it didn't have economies of scale, and would likely be right. Nonetheless, if RISC was really so much better, it would have been performing better back when Apple was using it, and it basically didn't.
No point in arguing about whether Itanium is RISC or not. It has some EPIC features, but it still uses a reduced instruction set architecture at its core. If you choose to not call that RISC, I am curious what you think RISC means.
In a minute there is time For decisions and revisions which a minute will reverse. -T.S. Eliot
I already have an 8-core machine, a Mac Pro, built from two 4-core MPUs. And I do a lot with it.
Hopefully what this means is Apple will be releasing a 16-core Mac Pro. Yum! Some saving will be called for, though. [cough]
I've fallen off your lawn, and I can't get up.
all by itself
No sig today...
http://browse.geekbench.ca/geekbench2/view/198947
It will improve gaming performance if you happen to be running something like Quake Wars with ray tracing.
Intel put together a demo on a workstation system with two Nehalem quad-core CPUs getting about 15 - 20 fps.
Since ray tracing is embarrassingly parallel, all one needs to do to improve performance is to throw more cores at it.
Keep in mind ray tracing is much more cpu intensive than gpu intensive...
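To make the "throw more cores at it" point concrete, here is a rough Python sketch of why ray tracing splits up so easily: every row of pixels depends only on its own coordinates, so rows can be handed to any number of workers and stitched back together. Everything in it (`WIDTH`, `HEIGHT`, `trace_pixel`, `render_row`, `render`) is made up for illustration and has nothing to do with Intel's demo; the "scene" is just one circle.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical tiny image; a real frame would be far larger.
WIDTH, HEIGHT = 16, 8

def trace_pixel(x, y):
    """Toy stand-in for tracing one ray: hit test against a single circle."""
    cx, cy, radius = WIDTH / 2, HEIGHT / 2, 3.0
    return 1 if (x - cx) ** 2 + (y - cy) ** 2 <= radius * radius else 0

def render_row(y):
    """Render one row. It reads no state from any other row."""
    return [trace_pixel(x, y) for x in range(WIDTH)]

def render(workers):
    """Split the frame into rows and hand them to a pool of workers.

    Note: CPython threads share one interpreter lock, so this shows the
    decomposition rather than a real speedup. For CPU-bound work you would
    swap in ProcessPoolExecutor (same map interface).
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in row order regardless of finish order.
        return list(pool.map(render_row, range(HEIGHT)))

if __name__ == "__main__":
    for row in render(workers=4):
        print("".join("#" if hit else "." for hit in row))
```

The image comes out identical no matter how many workers you pick, which is the whole point: no locks, no shared state, just more rows in flight at once.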
"I am the king of the Romans, and am superior to rules of grammar!"
-Sigismund, Holy Roman Emperor (1368-1437)
Mmmm.... a bunch of HP DL360s with two of those in each. Yummmmm....
/ Server Nerd
Rounded corners are part of CSS3 and Webkit and Mozilla support it.
Oh and even the largest javascript libraries come in at 100kb, so where do you get 5mb from? Went trolling for it perhaps?
So, next time you complain, check your facts and use a real browser. Not the joke browser that came with your joke OS on your dad's Dell.
MMO Quests are like orgasms:
You may solo them, I prefer them in a group.
Theoretically... EPIC > RISC > CISC... ...but it doesn't matter much in the real world. The small theoretical speed advantages get lost in the noise of compiler quality, product cycles, bus speed, programmer skill, etc. Underneath, the execution units are similar, and any brilliant breakthrough in EPIC compiler tech will just be copied into the x86 instruction decoder of x86 chips.
So basically, x86 is here to stay for a LOOOOOOOONG time.
Itanium is EPIC, not CISC. It is the exact opposite of RISC. It may not be running circles around x86, but that may be due to compilers not yet being advanced enough to take full advantage of the architecture. We may still see this change in the future.
Have you been in suspended animation for the last 10 years or are you just real fucking dumb? Do you even really know what a fucking compiler is? If you did, you would know that a compiler that advanced will appear at about the same time as better than human AI.
Although I've been marked as 'troll' (boo!), I'm pleased that some relevant points have been raised in reply. ;o)
I think we're still waiting for a unified breakthrough in core, memory, code, and data design; I can't say what those breakthroughs are, of course.
I'm not complaining about CISC vs RISC, but that our current memory architecture does not serve our instruction sets well, our layered and dynamically abstracted OO code is awkward to implement without queues of indirection, and our compilers are slaves to all the above.
</untroll>
With Supreme Commander 2 being the disappointment it is*, what's the need of 8-core? =-D
* It runs on XBOX. It's dumbed down and it looks and plays like SC&C (Supreme Command and Conquer). Though the game is not all bad. It seems to be what C&C should have been.
urd
One muses whether or not this also is the upcoming end of the Xeon line?
Religious speak to God. Insane are spoken to by God. When all shut up, one can finally hear Shostakovich in peace
try and take over the world!
The problem you are alluding to, as I see it, is not that the memory architecture does not serve the instruction sets. The problem is that latencies from the core to memory are very long, and writing code that is tolerant of those kinds of latencies is either impossible (or impossibly difficult) to do for the majority of typical (desktop in particular) applications. Any code that has a lot of branches (think: any software you write that has a lot of "if" or "switch" statements) invariably ends up reaching the point where the processor cannot perform any more useful work toward running code until it evaluates the condition of the logical expression. And eventually, unless your code fits in the processor's cache, that means stalling and waiting on memory to retrieve some data.
Code needs to be very deeply pipelined to tolerate those kinds of latencies. For some applications that is easy to do; for some it is possible, but very difficult; for many, it simply isn't possible.
The real problem there is that processing speeds in the core have been increasing very fast over the last few decades, and memory technology simply isn't improving at the same rate.
If you look back to the early days (original Pentium was the last time this was true, I believe), the speed of the processor and the depth of the processor pipelines actually did match the latency to memory, and the cores were slow enough that they were able to absorb a memory latency hit without incurring long multi-cycle stalls. But now the cores are just too fast. Now, if your code needs to wait for a result from memory, it's going to wait for a large number of cycles because the cores are so much faster.
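A quick back-of-the-envelope in Python shows the scale of the gap. The function name `stall_cycles` and the latency figures are my own assumptions for illustration, not measurements: I'm guessing roughly 100 ns to DRAM for a 66 MHz original Pentium and roughly 60 ns for a 3 GHz modern core.

```python
def stall_cycles(clock_mhz, dram_latency_ns):
    """Core cycles that tick by while one uncached load waits on DRAM.

    cycles = (cycles per second) * (seconds waited)
           = clock_mhz * 1e6 * dram_latency_ns * 1e-9
           = clock_mhz * dram_latency_ns / 1000
    """
    return clock_mhz * dram_latency_ns / 1000

if __name__ == "__main__":
    # Assumed, illustrative numbers only (see the note above).
    print("66 MHz core, 100 ns DRAM:", stall_cycles(66, 100), "cycles")
    print("3000 MHz core, 60 ns DRAM:", stall_cycles(3000, 60), "cycles")
```

With those guesses the old chip loses under 7 cycles per miss while the new one loses 180, even though the memory itself got faster in absolute terms. Plug in whatever latency you trust for your own hardware.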
In a minute there is time For decisions and revisions which a minute will reverse. -T.S. Eliot