48 Core Vega 2 in the Making


Posted by ryuzaki0 on Tuesday March 28, 2006 @08:37AM from the packed-in-like-sardines dept.

TobyKY76 writes to tell us The Inquirer is reporting that upstart Azul Systems is planning to integrate 48 cores on their next generation chip. From the article: "The first-generation Vega processor it designed has 24 cores but the firm expects to double that level of integration in systems generally available next year with the Vega 2, built on TSMC's 90nm process and squeezing in 812 million transistors. The progress means that Azul's Compute Appliances will offer up to 768-way symmetric multiprocessing."
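As a rough check on the numbers in the summary, the sketch below works out how many Vega 2 chips a 768-way box would need. It assumes that "768-way" counts one way per core, which the article excerpt does not state outright; `chips_needed` is a hypothetical helper name used only for this illustration.

```python
# Figures quoted in the summary above.
CORES_PER_VEGA2 = 48
MAX_SMP_WAYS = 768


def chips_needed(total_cores: int, cores_per_chip: int) -> int:
    """Return how many chips are required to reach total_cores, rounding up."""
    if total_cores < 0 or cores_per_chip < 1:
        raise ValueError("total_cores must be >= 0 and cores_per_chip >= 1")
    # Ceiling division without floating point.
    return -(-total_cores // cores_per_chip)


# Assumption: one SMP "way" corresponds to one core.
vega2_chips = chips_needed(MAX_SMP_WAYS, CORES_PER_VEGA2)
print(f"{MAX_SMP_WAYS}-way SMP at {CORES_PER_VEGA2} cores per chip: {vega2_chips} chips")
```

Under that assumption, the top-end appliance would hold 16 chips.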


AutoCAD by turtleAJ · 2006-03-28 08:40 · Score: 5, Funny

Behold the power of copy/paste!

Yeah, yeah... my Karma is SUPER negative...
48 cores is a nice start, but.... by Raul654 · 2006-03-28 08:40 · Score: 4, Informative

I know of a certain project that's working to put over a million cores into a system (160 into a single chip), and it should be finished and available off-the-shelf within a year or so.

--

To make laws that man cannot, and will not obey, serves to bring all law into contempt.
--E.C. Stanton
Finally! by lax-goalie · 2006-03-28 08:42 · Score: 4, Funny

Enough CPU power that even Microsoft Office will run with a little pep!
Blade servers by Seanasy · 2006-03-28 08:52 · Score: 4, Funny

So, chip manufacturers have adopted the Gillette approach to marketing chips. I guess it was inevitable after they went from one core to two. The only difference, I expect, is that they'll increase by powers of 2. Soon, we'll have an Intel Mach 512 Core Sensor Extreme or something :P
The wiki link says 80 not 160 - read by joe545 · 2006-03-28 08:53 · Score: 4, Informative

"Each 64-bit Cyclops64 chip (processor) will run at 500 megahertz and contain 80 cores." While it may have two threads per core, that is not what you claimed. You stated "...that's working to put over a million cores into a system (160 into a single chip)". 160 threads per chip, yes, but not 160 cores.
So what's the memory model? by Tired+and+Emotional · 2006-03-28 09:07 · Score: 4, Insightful

So what does the memory interconnect look like on this thing? They say it's not NUMA but I see no mention of what it is.

There's no way you can feed that many processors over a single bus, and if you've got symmetric access to a bunch of busses, that's one heck of a crossbar switch and I don't see that it's any easier to program than NUMA. Instead of making sure the data you need fast is local, you have to make sure you load balance - that has to be harder much of the time.

--
Squirrel!
Re:I don't know much about CPU internals but by AKAImBatman · 2006-03-28 09:13 · Score: 4, Informative
It would seem to me that a CPU's workload is roughly limited by the number of transistors it has multiplied by its MHz speed.

The number of transistors can go up for a variety of reasons. Chief among them are designs that use complex performance enhancements. To name a few:
- Superscalar processing
- Branch prediction
- Hyperthreading
- Out of order instructions
- Pipelining
The secondary source of transistor usage is coprocessors like Floating Point Units and SIMD Units.

The latest craze in processor design is to simplify the microprocessor back down to the most basic level. From there, the processors are ramped up through sheer numbers of parallel pipelines (i.e. threads) and cores as opposed to ramping up the individual CPU horsepower. These multi-core chips typically share coprocessors among pipelines or cores, and may even have entire cores dedicated to specific tasks like SIMD. As a result, a properly designed program will be able to execute within a very short period of time, thanks to the parallel nature of the multi-core architecture.

Now the only problem is in finding these "properly written programs".
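
To put a rough number on why "properly written" matters, here is a minimal sketch using Amdahl's law, which nobody in this thread cites but which is the standard back-of-the-envelope model for this. The parallel fractions are made-up illustrative values, not measurements of any real program, and `amdahl_speedup` is a name chosen just for this example.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup on `cores` cores when only `parallel_fraction` of the work scales."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part runs at full cost; the parallel part is split across cores.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical parallel fractions, evaluated for a 48-core chip.
for fraction in (0.50, 0.90, 0.99):
    speedup = amdahl_speedup(fraction, 48)
    print(f"{fraction:.0%} parallel on 48 cores: {speedup:.1f}x")
```

Under this model, a program that is 90% parallel gets only about 8x out of 48 cores, so the serial remainder dominates long before the core count does.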
--
Javascript + Nintendo DSi = DSiCade
Re:Memory interface by CliffClick · 2006-03-28 11:22 · Score: 5, Informative

The box is a flat SMP - if a core misses in L2 it's the same cost to any piece of memory (or remote L2).

The cores are our own design, not MIPS, not ARM, etc. Simple, short in-order pipeline, decent (not huge) caches.

Power consumption is very low compared to the equivalent stack of P4 blades or other main-frame solution.

The first-gen box (368 cores) is about 2700 watts in an 11U rack mount.
The next-gen box isn't much bigger, nor does it draw much more power (a little more of both, I believe).