Intel's "Terascale" Vision
Vigile writes, "Intel is pushing the envelope with its latest vision — 80 cores on a single processor. Dubbed 'Terascale' computing, Intel aims to bring low-powered, massively interconnected cores and unleash a new era in data-mining, media creation, and entertainment." For balance, read Tom Yager over at InfoWorld imploring AMD to stop at 8 cores while everybody gets the architecture right.
Now I can run 80 instances of Doom at the same time. Nothing quite like heavy multitasking.
Palm trees and 8
We are on our way to LCARS computers, I can feel it.
CH
Yes, remember the days when Intel used to accuse AMD of copying everything?
Now it looks like Intel has to take a page from the Sun Niagara roadmap for inspiration.
Ahh, the good old days.
dupe
doioioioioi
This processor must already be submitting stories... If it is there should be 78 more dupes just like it.
I like the idea of an 80 core processor. Multithreaded applications will work better. Why are people afraid of multiprocessors? Systems with dozens of processors are not uncommon. I don't see why it would be bad for the desktop.
http://github.com/gbook/nidb
Anyone else first read that as "Intel's Testicle Vision"?
Man, it's been a long day.
What if the Hokey Pokey really is what it's all about?
More Cores = More Dupes?
When you can have 80 underfed chickens?
Reality is nothing but a collective hunch.
That's what Zeus sucks!
i always thought that beowulf was a great story? once you get past the prose.
\.
Gillette is releasing a new shaver called the "Plutonium Mach80", a razor with 80 blades. Each blade has a separate distinct function, and you can get even closer shaves with the synergistic cuisinart action. Also comes in a "For Women" model for "sensitive areas". "Basically, 5 blades isn't enough. I mean, really, more is better, right?", says Gillette CEO James Kilts. Schick is reportedly working on a competitor blade that may exceed the legendary "100 blade barrier".
160-Cores that way I'd have Tera-Terascale Server
And slashdotters will still be overclocking the sumbitch.
Insert witty sig here.
Thinking about multicore processors, performance is the first thing to ponder. Imagine how it will ease our lives: while one processor core is updating the terminal display, another core could be tasked with processing the user input. As servers, they can handle backend processing better, for example where transactions are arriving from various point-of-sale terminals. By taking advantage of a multi-core processor, any one server could handle a greater number of transactions in the desired response time. It's gonna be real fast and efficient. :)
Didn't Sun try that sort of idea with the UltraSparc T1? If I recall correctly, while the concept of lots of light cores was cool, the real-world performance didn't do any better than Intel- or AMD-based systems.
steve
Oh, you're not stuck, you're just unable to let go of the onion rings.
If they succeed, does this mean the tera-rists have won?
A lab prototype like this can help them with something important: Given multi-core processors look to be the way future computers will be built, how do you feed them data? The current paradigm won't scale past 4 cores on a single chip's worth of FSB, and there are folks who don't think that even 4's going to be a useful increase over 2.
Even if Intel never sells a chip bigger than 16 or 32 ways, an 80 core lab mule will teach them many things about how to get information to a processor and keep those caches full of appropriate data.
-F
If you are going to have 80 cores on a chip, specialize some of the cores. I.e., have a few physics processors, a couple of graphics processors, maybe even one dedicated to search tree retrievals. One or two that are highly media-centric. So basically 64 cores for general-purpose CPU work and the other 16 for known common tasks such as graphics and HDTV/media processing.
This would eliminate the need for a separate graphics card for the average business person and bring down costs.
Then, 80 cores should be enough for most people(TM).
For home PCs, another option would be to have a gigabyte of L2 cache, so that you are able to store the entire operating system image right next to the CPU. That would probably speed things up more than 8 cores would.
With 80 friggin cores, you'd darn well better get the architecture right. But that's unlikely to happen beyond the memory subsystem.
What REALLY needs to happen is to rearchitect the IO subsystems. À la the (gasp, heresy!) mainframe environment. And before you go writing off that idea, consider that mainframes can handle 65535 separate IO devices. Refer to the article on IBM's channel architecture if you're interested. As everyone knows, a PC is lucky if it can handle 65 devices.
The PCI and Hypertransport approaches just won't cut it. Radical thinking (or reinventing, more likely) is needed with this kind of horsepower.
If Intel, or AMD, ever figures out how to handle that type of IO, instead of the ridiculously small number of devices PCs are currently limited to, THEN we'll see a real shake-up in this industry.
This must be some new definition of tera. From http://en.wikipedia.org/wiki/Tera
tera- (symbol: T) is a prefix in the SI system of units denoting 10^12, or 1 000 000 000 000.
That seems to leave Intel 999,999,999,920 cores short.
This is the sort of hardware environment that would allow a programming language like Erlang to thrive. We could very well see it being used more and more often in the future, as even regular consumers have access to machines with well over 16 cores.
For many applications, Erlang provides a far superior model to that of languages like C and C++ (with pthreads), Java, C# and Perl when it comes to massively multithreaded programming. Very high reliability is possible using Erlang, as witnessed by the many telephony products in which it is used. So for consumer applications, it could help developers build very solid systems that easily take advantage of many processors.
Practicality and usefulness problems aside, you can fit over 6,000 6502 processors in the space of a P4, each running at several GHz.
That looked like it said Testicle computing.
"Speaking the Truth in times of universal deceit is a revolutionary act." -- George Orwell
The first 80-core chip will actually look like a conventional kitchen hotplate. You add a pot of cold water on top of the chip, then with a dial on the unit you determine how much heat you want to produce. The CPU will automatically run the correct number of instances of Seti@Home to generate the desired level of heat.
The 4 X 80 "stove top" model will come out later that year. It will include an "oven" that has its own chip and convectional cooling.
I implore Tom Yager to STFU and step out of the Intel/AMD competition. If Tom gets his way, neither chip vendor will see the newly paved CPU autobahn as an actual challenge. Each vendor has their own competing architecture and that's what makes things fun. If we were back in the older days of the dozens of i486 arch clones, things would not be so interesting. Anyone could stamp out one of those chips.
I think competition is a good thing. If AMD releases an 8-way, then Intel should release a 16-way. Let them compete. Consumers win with the low price wars they bring.
Shut up, Tom.
- Just my $0.02, take with a grain of salt, your mileage may vary.
AMD: We now have two cores, so there!
Intel: Oh yeah, well we now have four cores- losers!
AMD: Oh yeah, well we're coming out with eight cores next. Ha beat that!
Intel: We can and will! We're going to come out with, with EIGHTY cores! Yeah that's right, eighty cores!
Disclaimer: I've not kept up on the Core War, so any inaccuracies are for dramatic effect...
If brevity is the soul of wit, then how does one explain Twitter?
The 80 cores are all simple floating point cores. A lot like the IBM/Sony Cell.
It is of interest for say super computers and video cards. It isn't the prototype of the Octodec80Core that will be in the new 72" iMac.
Yea it is a dupe alright.
What I think a lot of people are missing is that it almost looks like Intel is going to repeat the mistakes with Netburst all over again.
Now instead of a clock speed race Intel is starting a core race.
Intel is sticking more and more cores onto its current FSB. This is going to bite them just like the clock speed race did.
Instead of the way-too-long pipeline of the P4, they will have some really sick and twisted L1, L2, L3, and for all I know an L4 cache just to keep the cores from being memory starved.
If they do not watch it they will have 16 core systems that are slower than AMD's 8 core system. AMD's CPUs scale better than Intel's thanks to the integrated memory controller and HyperTransport links. I fear that Intel will have to follow AMD's lead yet again.
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
Now, imagine a beowulf cluster of those NOT running one instance of Vista!
I fear the Y2038 bug
I'm a video guy. I can't render video fast enough. I can't do transcoding fast enough. My video is getting larger and deeper in color, and I need more power.
All of that is threadable.
So is photographic processing. You can divide a picture 80 ways and have each processor do whatever it is you want to do on it.
Gamers? Fscking a.... I'm so SICK of hearing how everything is for them. Whether something is going to help Halo Life 3 run faster is not any of my concern.
There are lots of people working on their computers that want to see more cores because it will make our lives better.
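The "divide a picture 80 ways" idea above can be sketched roughly as follows. This is only an illustration, not anything from Intel's material: the helper names (`split_rows`, `invert_strip`, `invert_parallel`) are made up, the image is a plain list of rows of 0-255 values, and inversion stands in for whatever per-pixel work you actually want. It uses threads to keep the sketch short; in CPython you would likely swap in `ProcessPoolExecutor` to get real speedup on CPU-bound pure-Python work.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(height, parts):
    """Return (start, stop) row ranges covering 0..height in nearly equal strips."""
    base, extra = divmod(height, parts)
    ranges, start = [], 0
    for i in range(parts):
        # The first `extra` strips get one additional row each.
        stop = start + base + (1 if i < extra else 0)
        if stop > start:  # skip empty strips when there are more parts than rows
            ranges.append((start, stop))
        start = stop
    return ranges


def invert_strip(image, rows):
    """Example per-strip work: invert every pixel in the given row range."""
    start, stop = rows
    return [[255 - px for px in row] for row in image[start:stop]]


def invert_parallel(image, workers=80):
    """Hand each strip to a worker, then stitch the strips back together in order."""
    strips = split_rows(len(image), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda rows: invert_strip(image, rows), strips)
        return [row for strip in results for row in strip]
```

Each strip is independent, which is why this kind of job is about the easiest case for a many-core chip: there is nothing to synchronize until the strips are joined at the end.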
guns kill people like spoons make Rosie O'Donnell fat.
That didn't work because AMD worked out that architecture can trump speed. They innovated, and then did it again with decent dual-core (as in NOT the two-dies-on-one-chip cack that you churned out at first).
So, you improved your architecture and implemented dual-core properly, to produce the fantastic Duo. You got back in the race.
And then there was talk of more cores. And you went "Fuck that, bitches, stay DOWN - we is gon' fuck you up good with 80 cores, bitch, an' dat hard!". Yes, you decided to try and dominate the pissing contest of multi-core instead of megahurtz.
Jesus guys, didn't you learn a fucking thing? STOP trying to turn out something that little bit "more" than the competition, just get on with innovating and coming up with damn good chips. That's how AMD threatened you and, if you go on with this "anything you can do" shit again, you'll be back to square one.
Meta will eat itself
I would prefer to see more bandwidth on the bus than more cores/gigahertz in the CPU. Let's get away from 15x multipliers and get back to 2x multipliers, bring the rest of the system more in line with the CPU before we start scaling the cores out.
oh oh oh, so THAT is what they are waiting for before releasing Duke Nukem Forever? Eye candy that requires a minimum of 80 cores MUST be good!
Tequila: It's not just for breakfast anymore!
average consumer. Let's face it, most of us aren't going to see anything along these lines for quite some time. So while I give a nod at the excitement surrounding the technical research going on here, it isn't going to do much for me having to run bloated software on my work computer. And we certainly aren't going to see a "Beowulf full of these" anytime soon.
That said, I think that the benefit these will have on the scientific community (as well as servers, etc potentially) will be quite high and some time in a dozen or so years there will be practical products developed from this technology.
For some reason when I saw this initially I wondered if they might have some kind of potential use in artificial intelligence. For one the processing power available, but perhaps the ability to leverage the various cores will bring about some efficiencies that make it more practical.
Justin - Don't be afraid of my blog, it won't bite.
What's a memory bus? Oh right, that thing you use to access the DDR4 swap device when the page you want to access is no longer in the on-CPU RAM. ;-)
Seriously, look at the growth of L2 caches, and tell me the day isn't coming when they just call it "RAM" instead of "cache." If Intel and AMD want to keep piling transistors onto their chips, this'll give 'em something to do.
As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
...why is everyone so nuts about the CPU? It's just one part of the whole thing. A central part, granted, but CPU capacity currently isn't the bottleneck, or did I miss something?
I am not good at the hardware stuff but multiplying the cores again and again does not seem to be a revolutionary strategy. Or is it? (That's a real question.)
Or do we get all this core race just because that's where the publicity is and AMD and Intel need the media attention for their stock-market price?
640 cores should be enough for anyone.
Zen exercise for today:
"Wow, don't imagine a Beowulf cluster of these!"
Why is it that, with Intel talking about a radical change in consumer hardware, the level of comments on /. is barely higher than that on AOL?
:-) Of course a nice message passing symbolic language might score big.
We have had multiprocessor machines for ages. This is not a sudden unknown. Look up Transputer, Connection Machine, Beowulf, Cray. There is still ground to be covered but it's not unknown territory. The difference is this is Intel, and Intel needs a big market to sell to.
This is not going to make significant difference to the end user, most of them will still write letters, calculate spreadsheets and browse the web. It might be enough to finally expose MS et al for what they have always been, the parasites.
Where this is going to hit home is in the realm of programming and OS.
Want to run an OS primarily designed for uniprocessing on a multi-way architecture? Look at the issues Win&Lin have with SMP, limited to 16 processors I believe. NUMA and Beowulf are a different kettle of fish. So what will we have on these massive SMP architectures?
Programming: at last we might be getting out from under von Neumann. Progress might be possible after 30+ years of stagnation. The symbolic/functional languages are going to start to move forward. Hell, we might even get to run on stack-based CPUs with energy reclamation automated.
But given the history of software we'll have a bunch of ignorant, loud-mouthed idiots running around telling everybody the one true way is Java with mutexes and semaphores. PHBs will grab at the first thing that has "enterprise" written on it and is 'guaranteed'. Most programmers will code how they have always coded: head down, ass up. The number of processors will double every two years and the speed of software will continue to halve in the same period.
Of course nobody will suggest that a staged conversion should take place. There will be all these reasons to throw everything away and start over. Because this time we'll get it right!
I expect we're going to need some progress from the fan guys.
"Refer to the article on IBM's channel architecture if you're interested."
I had meant to refer to the article on wikipedia. They have a good overview of the approach. Supposedly, IBM is now moving completely away from interrupts as well, as they take too much overhead with that number of devices; and are instead moving towards a polled approach. At least that's what I've heard; I haven't seen this for myself, so I can't confirm it.
These are interesting times we live in.
Having all the cores on one board like this rather reminds me of EFF's Deep Crack.
http://en.wikipedia.org/wiki/EFF_DES_cracker
I know it is a completely different thing, but the EFF put a bunch of custom chips on boards (60+) and could crack the "then" nearly uncrackable DES crypto in a matter of 5 days.
That said, anybody want to bet the NSA is going to be the first people on the list when these things become practical in another couple years?
Justin - Don't be afraid of my blog, it won't bite.
Depending on the nature of your workload, a 400 object Field Programmable Object Array (object ~ core) might be a better choice than a traditional CPU.
Didn't we all argue about this already yesterday.
Engineering is the art of compromise.
Bandwidth, shmandwidth.
If data I/O is bound to a rotating disk platter, there's yer bottleneck.
Where's my 200 + GB RAM Array that loads the entire hard drive into memory, runs everything from there, then writes it all back to the hard drive whenever it has a moment (or just before shutdown)?
CPU, cores, bus bandwidth, bus speed, reduced memory latency and all that is great, I mean really great. But if it all eventually has to wait on a mechanical hard drive, who cares?
I can see it all now . . . "My system can wait on its hard drive I/O faster than your system!"
Data mining? How does having 80 cores improve I/O?
In the course of every project, it will become necessary to shoot the scientists and begin production.
Perhaps someday a game will have the minimum requirements of an 80 core processor.
I will bend like a reed in the wind.
Gillette is approaching a Singularity... http://www.collisiondetection.net/mt/archives/2006/06/the_gillette_si.html
It doesn't matter which ape activates the Monolith
...that the average (l)user's comp that I have to fix won't be frozen from the 76 processes of spyware running and that I can actually use the system and repair it there?
Why did kdawson add that to the submission? Since when has Slashdot done that for AMD press releases?
"Sufferin' succotash."
The average buyer will not understand "out of order architecture" anyway. The MHz race was different, because even non-techies could see how the computers got faster with increasing clock speed.
But now?
Maybe it will be "number of cores". Otherwise Intel and AMD will have to use meaningless slogans like "Intel inside" to suggest a sense of security when using their particular brand.
I expect a mixture of touting lots of cores and almost-fraudulent crap like "the Pentium III will make your internet faster".
C - the footgun of programming languages
I think the important thing in the announcement is not the 80-core thing, but the idea of a memory chip sandwich. What was described is attaching the chips with what would be several thousand connection points giving more than a terabyte per second aggregate bandwidth. I heard (I watched the presentation) each core would have 256 megabytes dedicated memory.
Assuming this memory could be used smartly, segregating incoherent memory spaces (it seems rather obvious the dedicated memory would not be a coherent image of the main memory, or we would do away with any gains we got from the scale), the chip could achieve huge throughput.
And, about Tom Yager saying Intel and AMD should stop innovating... Well... Diversity is one of the two main tools of natural evolution. It served us well in the 70s and 80s and, if we can have some more of it, I say it will be great.
Send in the Windows-proof architectures.
http://www.dieblinkenlights.com
They have a slide that matches successive levels of application demand with: Text, Multimedia, Video & 3D, RMS.
Okay, so I understand that AI is more compute-intensive than video. And I understand that it could be easier (tera instead of peta) if social reasoning isn't included. But really, Intel, I just don't want RMS on my computer.
Also, the jump from nanoscale to terascale may be impressive, but I don't think it'd be useful to have a transistor with a 310-million-mile-wide gate. Your device isn't going to be useful if it doesn't fit inside earth's orbit.
Throwing hardware at a badly performing application is usually the wrong way to go about getting better performance. The performance gains you're going to get are usually marginal unless there's some gross configuration mistake. The biggest performance gains are made by making changes to the application. Count that as another reason to investigate free software.
I think Intel is trying to sucker AMD into wasting tons of time and money trying to compete in a useless core war. Intel's architecture is behind and it needs to catch up. Intel can probably invest in two parallel development efforts more easily than AMD due to its larger size. If AMD takes the bait, it may be a big mistake. This is similar to the GHz war.
Intel (and AMD, for that matter) needs to design some sort of application layer that handles parceling out tasks to the various cores regardless of the number of them. The biggest problem with multi-core applications right now is that many, many programs simply don't take multiple cores into account. In addition, this is going to become a huge hassle for future programmers unless this is done: "Well, how many cores are we going to write this program to take advantage of?".
Also, this is something that Intel/AMD are definitely going to have to do on their own. They simply can't leave it up to the operating system makers to create this. I mean, look at Microsoft's 64-bit Windows Vista for a great example of why you can't leave it to the OS people to do.
Until something like this is done, I doubt you'll see much enthusiasm beyond dual and quad core processors because it will take too much effort to tailor software for x number of cores with x changing on a monthly basis.
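For what it's worth, the "layer that parcels out tasks regardless of the number of cores" looks something like this from the programmer's side. This is a minimal sketch using Python's standard library, not anything Intel or AMD ships; `run_on_all_cores` is a made-up name, and the thread pool is just the shortest thing to show (a process pool or a native runtime would be the realistic choice for CPU-bound work).

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_on_all_cores(task, items, workers=None):
    """Run task(item) for every item, sized to however many cores are present.

    The caller never states a core count: it is discovered at run time,
    so the same program works unchanged on 2, 4 or 80 cores.
    """
    workers = workers or os.cpu_count() or 1  # cpu_count() may return None
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands items to idle workers and returns results in input order.
        return list(pool.map(task, items))
```

The point is that the question "how many cores do we write this for?" goes away as long as the work is expressed as a pile of independent tasks. The hard part the comment is really about is getting existing programs into that shape.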
You are who you are, let no one tell you different. But, never close your mind to a new point of view.
Why doesn't the dupe tag work? Someone has obviously got around it by using dup. I liked the dupe tag on an article, it meant I wouldn't have to look at 5000 comments all going "omg! it's a dupe!" and "thanks to this processor we'll see this article another 78 times!"
In Soviet Russia the insensitive clod is YOU!
from http://www.informationweek.com/shared/printableArticle.jhtml?articleID=191901844
"I've always been amazed at the Apollo spacecraft guidance system, built by the MIT Instrumentation Lab. In 1969, this software got Apollo 11 to the moon, detached the lunar module, landed it on the moon's surface, and brought three astronauts home. It had to function on the tiny amount of memory available in the onboard Raytheon computer--it carried 8 Kbytes, not enough for a printer driver these days. And there wouldn't be time to reboot in case of system failure when the craft made re-entry. It's just as well Windows wasn't available for the job. The Apollo guidance system probably seems like routine software to technology sophisticates. Far more complex navigational systems are in operation today. The system's essentials were a few well-known algorithms based on proven logic. But to me, it's still rocket science. Great software dazzles us by virtue of what it does correctly in the face of everything that could go wrong."
Wow, can you do that without 80 cores?
2+2 = 5 (for very large values of 2)
I like the idea of an 80 core processor. Multithreaded applications will work better
Multithreading models from the Windows/Unix/Linux community all assume equal access to system resources such as memory across all threads. They like Uniform Memory Architecture models.
An 80 core system can't really provide a uniform memory access model, as it runs into severe switching and coherency problems. (You want to snoop HOW MANY L1 caches?!??). Fancy interconnects like hyperchannel and Monte Carlo stochastic schemes start getting pinched for bandwidth around 8 cores. With this many cores, you'll wind up with computing meshes of local processors and memory interconnected using some interesting switching scheme. The article even mentions this, with a bit of hand-waving over the issues of bandwidth in shared system resources. "Intel's answer is to attach 256 Mbit of SRAM directly to EACH core. " Interconnect topology is left at a simple tiling scheme, but they are exploring ring topologies.
The result looks remarkably like a transputer mesh. I've programmed these in the past, and the model is rather different than simple multithreading. Being able to decompose the programming problem into a number of independent steps with relatively low communications demands is essential. The ability to reconfigure the interconnect topology to match the problem's data flow is essential to being able to get as much out of the processor set as possible. Without this, one can wind up with lots of idle processors, blocked on data starvation.
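To give a flavor of how that model differs from shared-memory multithreading, here is a toy pipeline where each stage only ever talks to its neighbors over a channel. It is a sketch under loud assumptions: Python threads and `queue.Queue` stand in for processors and hardware links, the names `stage` and `run_pipeline` are invented for illustration, and there is no error handling (a stage that raises would stall the pipeline).

```python
import queue
import threading

DONE = object()  # sentinel passed down the chain to mark end of stream


def stage(func, inbox, outbox):
    """One 'processor': read a message, apply func, forward the result."""
    while True:
        item = inbox.get()
        if item is DONE:
            outbox.put(DONE)  # tell the next stage we are finished
            return
        outbox.put(func(item))


def run_pipeline(funcs, items):
    """Chain one worker per function, connected only by queues (no shared state)."""
    links = [queue.Queue() for _ in range(len(funcs) + 1)]
    workers = [
        threading.Thread(target=stage, args=(f, links[i], links[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for w in workers:
        w.start()
    for item in items:
        links[0].put(item)
    links[0].put(DONE)
    results = []
    while True:
        out = links[-1].get()
        if out is DONE:
            break
        results.append(out)
    for w in workers:
        w.join()
    return results
```

Even in this toy, you can see the data-starvation issue: if one stage is much slower than the rest, every stage after it sits blocked on an empty inbox, which is exactly why matching the topology to the data flow matters.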
I have a feeling that AMD is going to come out of nowhere and make something that Intel will not be able to beat. AMD has been quiet... too quiet. Also, AMD is very careful about how their pricing is set. Intel is going to be big $$$. GO AMD!!! 80 cores... Make optical CPUs.
Back in April, the semi-reliable (rumor-wise and server-wise) Mac OS Rumors claimed that 10.5 "Leopard" would have some pretty cool "thread farming" technology. I'll quote the whole page:
Q: What does the "B." in Benoit B. Mandelbrot stand for? A: Benoit B. Mandelbrot
Now, of course, there are data dependency and variable interaction problems to handle (which OpenMP provides facilities for), but, if Blah.DoSomethingUniquetoThisElement() were actually your work, you'd be done with this code. It's really that easy.
The problem is not the number of processors but actually making use of them, so you *can* have enough cores. There is no point having an 80 core system if only two of the cores are being used.
What Tom Yager is saying is that AMD and Intel shouldn't just fall into a race to see how many cores they can fit on a chip, but should actually get the architecture and software right so that adding new cores really does give a performance advantage.
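One standard way to put numbers on this is Amdahl's law, which is my addition here rather than anything from Yager's article: if only a fraction p of a program can run in parallel, the best possible speedup on n cores is 1 / ((1 - p) + p / n). A quick sketch (the function name is just for illustration):

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only part of the work can be spread over cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part runs at normal speed; only the parallel part is divided up.
    return 1.0 / (serial_fraction + parallel_fraction / cores)
```

For example, `amdahl_speedup(0.9, 80)` stays below 10, because the 10% serial part caps the speedup at 10x no matter how many cores you add. That is the "getting the software right" half of the argument in one formula.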
If this were really happening, what would you think?
is MRI for Magneto Rotational?
...
I have seen PMatlab, MatlabMPI paper recently ( http://www.ll.mit.edu/MatlabMPI/ )
Though have no time to implement/check it
Maybe I'm a throwback, but I'm still more interested in speeding up a single thread than in having 80 separate ones. It's fundamentally harder, but that's why it's useful, no?
There have been several supercomputer companies who have come and gone on the premise of fine-grain parallel processing, such as mentioned in the title. All of them used some version of UNIX (CM had a pre-UNIX OS at one time). The reason for this was that UNIX was the "Linux" of the 1980s and early 1990s: low cost source code license, porting experience to many machines, etc. They all had OS extensions and C-language constructs for managing fine-grain parallelism. So my point is there is a lot of experience out there in this area.
These companies died due to using custom hardware they could upgrade only in 3-5 year generations. Clustered commodity workstation/PC CPUs generally upgraded 3-5 times faster and "caught up" in price/performance.
If a cell-processor can emulate the x86 instruction set fairly transparently, then they could finally beat this fine-grained jinx.
I am worried for Apple.
...Or the OS running on it.
I hope they don't get suckered into this wild Intel adventure...
I hope Steve Jobs doesn't buy into this and instead focusses on getting Mac OS X to scale well on 2 - 4 cores.. (perhaps 8 cores for OS X Server).
IMHO, a normal consumer should NOT need more than 2 cores.
And if they do, it indicates that there is something wrong with the 2 cores that are there.
I somehow am *very* cynical about this new trend of throwing a few dozen cores into a chip.
Just-a-Jester