Intel Reveals More Larrabee Architecture Details
Ninjakicks writes "Intel is presenting a paper at the SIGGRAPH 2008 industry conference in
Los Angeles on Aug. 12 that describes features and capabilities of its
first-ever forthcoming many-core architecture, codenamed Larrabee.
Details unveiled in the SIGGRAPH paper include a new approach to the
software rendering 3-D pipeline, a many-core programming model and
performance analysis for several applications. Initial product
implementations of the Larrabee architecture will target discrete graphics
applications, support DirectX and OpenGL, and run existing games and programs.
Additionally, a broad potential range of highly parallel applications including
scientific and engineering software will benefit from the Larrabee native C/C++
programming model."
With the supposed death of Usenet, the closing of PARC, and the general Facebookification of the Internet, it's nice to see a bunch of nerds get together and geek out simply for the sake of it.
I want to delete my account but Slashdot doesn't allow it.
With more and more emphasis going toward GPUs and other specialized processors, I wonder if this is to try to fight that trend and have Intel processors able to handle the whole computer again.
This is good news for Mac mini and MacBook users.
How so? Has Apple announced that it will adopt Larrabee for the Mac Mini or the MacBook? No. All you have are rumors and speculation by MacRumors and Ars Technica. When Apple says they will adopt the Larrabee GPU, then you can say that it is good news for Mac users of any stripe. Until then, it's just Intel news, not Apple news.
My blog
Neither the summary nor TFA itself mentions the words "Ray Tracing" or "Rasterization".
Am I missing something here?
I think it depends on how much Larrabee will cost, however with what we know so far Apple seems to be heading into multi-CPU architectures, so using Larrabee would make sense.
This is good news for Mac mini and MacBook users. But I can't stand them.
Fuck systemd. Fuck Redhat. Fuck Soylent, too. Wait, scratch the last one.
You know I'm turning Japanese, you know I'm turning Japanese you know I think so. (FWIW, the parent is probably referring to the word 'Facebookification' because he apparently has that English is slowly adopting the Japanese practice of positional grammar by verbifying nouns and nounifying verbs.)
I get a warm-fuzzy feeling seeing that OpenGL isn't dead. I was first and best impressed with it when I played Neverwinter Nights. Why hasn't it caught on more? Why don't more Open Source Games use it (as opposed to reusing the Quake engine)?
"The price good men pay for indifference to public affairs is to be ruled by evil men." ~Plato (427-347 BC)
I think it depends on how much Larrabee will cost, however with what we know so far Apple seems to be heading into multi-CPU architectures, so using Larrabee would make sense.
Larrabee costs somewhere between 150 and 300 watts, so MacBooks and Mac Minis are not likely to use it. The Mac Pro, on the other hand, possibly.
It almost certainly won't work. In the past, there has been a swing between general and special purpose hardware. General purpose is cheaper, special purpose is faster. When general purpose catches up with 'fast enough' then the special purpose dies. The difference now is that 'cheap' doesn't just mean 'low cost'; it also means 'low power consumption,' and special-purpose hardware is always lower power than general-purpose hardware used for the same purpose (and can be turned off completely when not in use).
If you look at something like TI's ARM cores, they have a fairly simple CPU and a whole load of specialist DSPs and DSP-like parts that can be turned on and off independently.
I am TheRaven on Soylent News
"its first-ever forthcoming many-core architecture, codenamed Larrabee" The Core architecture has duos and quads. Nehalem is just about to launch, going up to octocores at least. The point of the article eluded me until I went to Wikipedia and discovered that the Larrabee being talked about is a *GPU* rather than a CPU. Could have used that information somewhere in the original post.
Bearing in mind all the other promises Intel has made about their previous graphics offerings, I'm rather inclined to think that once again this will underwhelm. Especially considering all the crap that's been coming out of Intel about real-time raytracing. (It's always been just around the corner because rasterisation always gets faster.)
That's not to say that it isn't an interesting bit of tech, but from what I've seen so far it looks like the x86 version of Cell. Of course, it's a PC part and won't be showing up in any consoles anytime soon, so as a console developer it doesn't really do anything for me. I'm mostly interested in how they'll handle memory bandwidth.
I also expect that nVidia will put out something within 12 months that will stomp its guts out.
Is it not also good news for Windows users, Linux users, and *BSD users? I mean, it's likely that these OSes will also be made to make use of Larrabee when the technology is released, right? Yet, it's not news for any of those platforms or Apple users unless/until those platforms are able to make use of the new GPU technology. Everything else is just speculation, especially so for Apple, who might easily decide not to use Larrabee. Since Apple is the only legit supplier of Mac OS X hardware, it's definitely not news for Apple users until Apple says it is. OTOH, Windows, Linux and *BSD users can get their hardware from any supplier.
there are like a LOT of computers with really good cpus and really weak video chips, like laptops and dell computers
Why not just do a software mode driver for em?
that probably would make the 3D gaming market a bit bigger without forcing people to buy a 3D accelerator card (which is kinda impossible to do on most laptops)
damn! You could power a 10-20 ARM or PPC multiprocessor unit. And the architecture wouldn't suck cowboy neal's sweaty balls.
I was actually trolling...
Fuck systemd. Fuck Redhat. Fuck Soylent, too. Wait, scratch the last one.
I don't think so. I think the fact is that with the right architecture (which Intel is trying to get into place), which exact core on which processor handles a specific task should become less and less relevant.
What this technology will hopefully provide will be the ability to have a more flexible machine which can task cores for graphics, then re-task them for other needs as they come up. Your serious gamers and rendering heads will still have high end graphics cards, but this would allow more flexibility for the "generic" business build PCs.
I'm a fiscal conservative, it's a pity we don't have a political party anymore
It almost certainly won't work. In the past, there has been a swing between general and special purpose hardware.
Except with unified shaders and earlier variations the GPU isn't that "special purpose" anymore. It's basically an array of very small processors that individually are fairly general. Sure, they won't be CPUs, but I wouldn't be surprised if Intel could specialize their CPUs and make them into a competitive GPU. At the very least, good enough to eat a serious chunk upwards in the graphics market, as they're already big on integrated graphics.
Live today, because you never know what tomorrow brings
..a, uh, beowulf cluster...I just can't put my heart into it anymore!
Tic-Tac-Toe, Global Thermonuclear War, and relationships all have the same winning move.
Today at a coder's party we had a discussion about Intel's miserable corporate communications.
Intel's introduction of "Larrabee" is an example. Where will it be used? Only in high-end gaming computers and graphics workstations? Will Larrabee provide video adapters for mid-range business desktop computers?
I'm not the only one who thinks Intel has done a terrible job communicating about Larrabee. See the ArsTechnica article, Clearing up the confusion over Intel's Larrabee. Quote: "When Intel's Pat Gelsinger finally acknowledged the existence of Larrabee at last week's IDF, he didn't exactly clear up very much about the project. In fact, some of his comments left close Larrabee-watchers more confused about the scope and nature of the project than ever before."
The Wikipedia entry about Larrabee is somewhat helpful. But I don't see anything which would help me understand the cost of the low-end Larrabee projects.
This only goes to show that the people at Intel really can't count..
(Firmly tongue in cheek, of course :)
Coz eternity my friend, is a long *ing time.
Your comment, "... as they're already big on integrated graphics." is true for some values of "big". Intel has been big in integrated graphics the way a dead whale is big on the beach.
Basically, once you discover what Intel graphics has not been able to do, you buy an ATI or Nvidia graphics card.
How's ARM at floating point vector math these days?
The power brick for my Core 2 Duo Mac mini is somewhere around 80 Watts I think. And I'd assume the actual usage is lower than that. Let's say 50~60 Watts for the whole computer (CPU, GPU, hard drive, optical drive, RAM, FireWire, USB, etc).
If Larrabee takes 150~300 Watts, then it's just insane, no matter how many cores it has.
More people in the world need Intel level graphics than need ATI/NVIDIA. This is borne out in sales numbers - Intel is the #1 graphics chip maker and has been so for many years.
No sig today...
*Additionally, a broad potential range of highly parallel applications including scientific and engineering software will benefit from the Larrabee native C/C++ programming model."*
Can someone explain how the C/C++ programming model is compatible with a many-core system?
This is the most interesting statement, but I am completely unaware of how it is even somewhat true. In my experience the C/C++ languages have little to no native support for multiple threads (short of some enhancements in the new standard which there is no support for yet). All multi-threaded C/C++ programs rely heavily on OS specific extensions, not on any programming model in the languages...
With the vector floating point (VFP) coprocessor it's not too shabby.
Except the part where GPUs have 256-512 bit wide, 2GHz + dedicated memory interfaces and Intel processors are...way, way less. Add that to the ability to write tight code on a GPU that efficiently uses caching and doesn't waste a cycle, compared to the near impossibility of writing such code on the host processor which you share with an OS and other apps... meh.
There might be some good stuff that can be done with this architecture, but I am not convinced it's a competitor to GPUs pound for pound. You have to really believe ray-tracing is the future, and that some of the multi-texturing shenanigans that drive memory bandwidth in GPUs are in the past. That's a big leap of faith. I'd prefer to believe once they build it, we'll find a great use for it.
Still it's nice to see something new happening.
What most people don't seem to realize is that Larrabee is not about winning the 3d performance crown. Rather, it is an attempt to change the playground: you aren't buying a 3d card for games. You are buying a "PC accelerator" that can do physics, video, 3d sound, dolby decoding/encoding etc. Instead of just having SSE/MMX on chip, you now get a complete separate chip. AMD and NVIDIA already try to do this with their respective efforts (CUDA etc), but Larrabee will be much more programmable and will really pwn for massively parallel tasks. Furthermore, you can plug in as many Larrabees as you want, no need for SLI/crossfire. You just add cores/chip like we now add memory.
P.
Intel graphics has been TERRIBLE. We buy ATI video adapters (about $20) to put in business computers we build. (We've never bought from eWiz.com, or the particular video cards shown. That is just an example.)
> so far it looks like the x86 version of Cell
Then you missed the fact that the article says it uses a coherent 2-level cache for inter-core communications; the Cell BE is quite exotic in that it uses DMA transfers and has no memory coherency between the SPEs.
The article doesn't explicitly state that the Larrabee cores are homogeneous, but I would be surprised if they weren't; the Cell cores are somewhat heterogeneous if you want to use the PowerPC core to squeeze the last drop of processing power out of it.
You are correct in that Intel appears to have copied the ring network of the Cell BE, although I don't understand why they need it in addition to the coherent cache. Oh, well, guess I'll have to wait until the paper really hits the public.
960 cores, 4 teraflops, 400 GB/s memory bandwidth in a 1U rackmount: nVidia Tesla S1070
What'll be more interesting is if it fragments the PC market.
If you want a super-fast ray-tr, erm, protein folding application you need one with the Larrabee chipset. If you want to play the latest game you'll need a traditional PC + graphics card. Would it be possible that business PCs turn to Larrabee and home PCs stick with current architectures?
Larrabee looks very interesting for scientific computing, but what makes it better for graphics than an ATi/nVIDIA GPU?
This post climbed Mt. Washington.
Does this mean that Norton will scan my drive in 3D?
Seriously, manymanycores architectures are nice for public servers that are coded very well. Potentially able to serve N clients at once, the machines running Larrabees will usually bottleneck somewhere else.
For the desktop user, manymanycores mean that the main window will move smoothly in the foreground while anti-blackware, indexes and updates consume the background.
For the power gamer, even manymanycores won't be enough. There's no such thing as "enough".
Working to work less.
General purpose is cheaper, special purpose is faster.
Only sort of. Special purpose is often cheaper, hence the profusion of ASICs. General purpose is more flexible, and so more desirable as a result. Also, special purpose is only cheaper if "general purpose" isn't quite up to the task. Special purpose is also only cheaper if you're doing it all the time.
For instance, on the low end, MP3 players often have (had?) MP3 decoder ASICs, because it was too expensive to perform on the very small CPU. On a PC, there's no point. Even though using an ASIC would be cheaper for just decoding MP3s (use less power, free up the required CPU, etc), it isn't worth it since the CPU is fast enough, and doesn't spend all the time playing MP3s.
SJW n. One who posts facts.
But can Intel make good drivers, as their onboard ones suck?
Their onboard video chips look good on paper but then come in dead last next to Nvidia and ATI onboard video, and that is without the use of side port RAM. ATI's new onboard video can use side port RAM.
Last I heard, Tom Forsyth and Michael Abrash were writing the graphics "drivers". So I expect good things.
I live in Larrabee, IA and I'm getting a kick out of these replies . . .
Its architecture could (potentially) make for better multi-GPU solutions (i.e. with a shared frame buffer across all cores instead of x amount of RAM per GPU), and the use of tile-based rendering has a fair amount of efficiency benefits to make it interesting.
It's way too early to say whether it'll even be equivalent performance-wise to AMD and NVIDIA's GPU designs in Larrabee's release time frame, and it'll be very dependent on its compiler and drivers, but as a concept right now it's hugely interesting in a number of ways as a pure graphics architecture alone.
That is much more detailed than the one linked in the article summary. It can be found here.
You keep on using that word, I do not think it means what you think it means.
The biggest debate in all of graphics-dom [graphixery?] for the last six months or a year has been Ray Tracing -vs- Rasterization.
So what happened?
I just don't understand how you can have an article about next-generation GPU tech and not ask whether the logic gates & data busses are going to be optimized for Ray Tracing or for Rasterization or for both [which would require at least twice the silicon, if not twice the wattage and twice the heat dissipation].
Has Intel completely abandoned the idea of optimizing silicon for Ray Tracing, and returned to what essentially amounts to software-based graphics, or is there a "Field Programmable" aspect to Larrabee which would allow someone [the programmer who writes the driver, maybe?] to choose how he wants the silicon to be optimized?
Mac minis integrate 2.5 inch hard disk drives (ATA in the G4 models and SATA in the Intel models), CPUs and other components originally intended for mobile devices, such as laptops, contrary to regular desktop computers which use lower cost, but less compact and power-saving components. These mobile components help lower power consumption: According to data on the Apple web site, first-generation PowerPC Mac minis consume 32 to 85 Watts, while later Intel Core machines consume 23 to 110 Watts. By comparison, a contemporary Mac Pro with quad-core 2.66 GHz processors consumes 171 to 250 Watts.
On the Oregon Coast born and raised, On the beach is where I spent most of my days
FTA: "The Larrabee architecture supports four execution threads per core with separate register sets per thread. This allows the use of a simple efficient in-order pipeline, but retains many of the latency-hiding benefits of more complex out-of-order pipelines when running highly parallel applications."
Funny how Intel made fun of Sun's Niagara, yet the above appears to be designed very similarly, if not exactly the same.
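For anyone wondering why four threads on a simple in-order core can hide latency, here is a rough toy model, not anything taken from the paper. It assumes a single-issue core that keeps running one thread until that thread stalls on memory and then switches to the next ready thread; the switch-on-stall policy, the function name, and the cycle counts are all assumptions made purely for illustration.

```python
def core_utilization(n_threads, compute_cycles, stall_cycles, total_cycles=10_000):
    """Fraction of cycles a toy single-issue in-order core spends doing work.

    Assumed model: every thread alternates `compute_cycles` of work with a
    memory stall of `stall_cycles`. The core runs a thread until it stalls,
    then moves on to the next thread that is ready (switch-on-stall).
    """
    work_left = [compute_cycles] * n_threads   # work remaining before next stall
    ready_at = [0] * n_threads                 # cycle when each thread can run again
    busy = 0
    current = 0
    for cycle in range(total_cycles):
        # Look for a ready thread, starting with the one that ran last.
        for offset in range(n_threads):
            t = (current + offset) % n_threads
            if ready_at[t] <= cycle:
                busy += 1
                work_left[t] -= 1
                current = t
                if work_left[t] == 0:
                    # Thread hits memory: park it and hand the core onward.
                    ready_at[t] = cycle + 1 + stall_cycles
                    work_left[t] = compute_cycles
                    current = (t + 1) % n_threads
                break
        # If no thread was ready, the cycle is simply wasted.
    return busy / total_cycles


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, "threads:", core_utilization(n, compute_cycles=10, stall_cycles=30))
```

With 10 cycles of work per 30-cycle stall, one thread keeps this toy core busy a quarter of the time, and four threads are just enough to cover the stall completely. Real workloads and real memory systems are obviously messier than this.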
Except with unified shaders and earlier variations the GPU isn't that "special purpose" anymore. It's basically an array of very small processors that individually are fairly general.
Even with all the advances in shaders, GPUs are not quite generalized, for several reasons. The data fetch logic is hardcoded (yes, there is some support for more arbitrary memory reads, but those are limited and take a fairly big performance hit). GPUs also have poor performance for dynamic branching: sure, they support it, but if all the pixels for a subset (i.e. the whole bank of little GPU cores processing a fragment) don't take the same branches, your performance is hosed. The bank size is usually 8 or 16 cores working on a rectilinear fragment of adjacent pixels.
Intel's approach is not very GPU-like at all. Indeed, it's more similar to the Cell SPUs (internal ringbuffer included), but instead of using DMA to access memory and individual memory work areas, it has direct access to memory (with hardware prefetching) and a large shared L2.
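The dynamic branching penalty described above can be sketched with a crude cost model. It assumes a bank of cores runs in lockstep and has to execute both sides of an if/else whenever its pixels disagree; the function names and cycle costs are made up for illustration and do not describe any particular GPU.

```python
def bank_cycles(takes_branch, then_cost, else_cost):
    """Cycles for one lockstep bank to get through an if/else (assumed model)."""
    cycles = 0
    if any(takes_branch):
        cycles += then_cost   # at least one pixel needs the 'then' side
    if not all(takes_branch):
        cycles += else_cost   # at least one pixel needs the 'else' side
    return cycles


def total_cycles(pixel_flags, bank_size, then_cost, else_cost):
    """Sum the cost over consecutive banks of `bank_size` pixels."""
    total = 0
    for start in range(0, len(pixel_flags), bank_size):
        bank = pixel_flags[start:start + bank_size]
        total += bank_cycles(bank, then_cost, else_cost)
    return total


if __name__ == "__main__":
    coherent = [True] * 16 + [False] * 16   # each bank agrees internally
    divergent = [True, False] * 16          # every bank is split
    print(total_cycles(coherent, 16, then_cost=10, else_cost=40))   # 50
    print(total_cycles(divergent, 16, then_cost=10, else_cost=40))  # 100
```

Same 32 pixels and the same number taking each side, but the split banks pay for both paths, which is the "performance is hosed" case.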
My bad - when something is this irrational, I guess the first suspicion should be politics - instead, I had simply assumed incompetence [or insouciance or absence of inquisitiveness] on the part of the author.
I will work to up my cynicism.
I wonder what a 486 core would perform like on a modern fab process. After all that chip had a modest I/D cache, single instruction/cycle performance for many instructions, and integrated floating point - all with a tiny transistor budget by modern standards.
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
In order to get a product well hyped, sometimes it is best to keep the product secret, until it is close enough to being released, that it can assuredly be released before the hype has died down. This is something Apple seems to understand.
I'm more of a 'highend' graphics coder (read: not games). Let's say we want to do some complex soft body animation. We need to be able to access a coherent data structure that represents the entire geometry mesh to be able to do that. You can't do that on the GPU - triangles and vertices are all you get.
Let's say we want to use the numerous deformation techniques that do not work by transforming normals via a matrix (i.e. they must re-compute the normals because the deformation is non linear - FFDs for example). You have only been able to do that on the GPU since the GeForce 8800 came out (it requires geometry shaders) - but even then it isn't perfect (can't obey soft/hard edges, and a few other things I'd like to do).
Let's say you want to start doing Global Illumination rendering (of which Ray tracing is one technique - but not the only, or most useful one), you'll need to access a scene database - which you can't do on a GPU.
I've never really liked the design of the GPUs we currently have (though I realise why they've evolved in the way they have) - It all feels like nasty hack after nasty hack (which changes with every new GeForce card). I've always wished that instead of a GPU we just had an add-in highly vectorised CPU which we can use for anything. There are literally hundreds of things I can see this being useful for.
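To make the normal re-computation point concrete, here is a minimal CPU-side sketch of rebuilding per-vertex normals after a non-linear deformation. It needs the whole mesh (shared vertices plus triangle indices), which is exactly the kind of coherent structure mentioned above. The function name and mesh layout are assumptions for illustration, and it ignores soft/hard edges entirely.

```python
import math


def vertex_normals(vertices, triangles):
    """Area-weighted vertex normals for an indexed triangle mesh.

    vertices:  list of (x, y, z) positions, already deformed
    triangles: list of (a, b, c) vertex indices, counter-clockwise winding
    """
    sums = [[0.0, 0.0, 0.0] for _ in vertices]
    for a, b, c in triangles:
        ax, ay, az = vertices[a]
        bx, by, bz = vertices[b]
        cx, cy, cz = vertices[c]
        ux, uy, uz = bx - ax, by - ay, bz - az
        vx, vy, vz = cx - ax, cy - ay, cz - az
        # The cross product's length is twice the triangle area, so simply
        # adding it up weights each face by its area.
        face = (uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx)
        for i in (a, b, c):
            for k in range(3):
                sums[i][k] += face[k]
    normals = []
    for nx, ny, nz in sums:
        length = math.sqrt(nx * nx + ny * ny + nz * nz)
        if length > 0.0:
            normals.append((nx / length, ny / length, nz / length))
        else:
            normals.append((0.0, 0.0, 0.0))  # degenerate or unused vertex
    return normals


if __name__ == "__main__":
    quad = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
    faces = [(0, 1, 2), (0, 2, 3)]
    print(vertex_normals(quad, faces))
```

Every vertex needs contributions from all the faces that touch it, so the loop has to see the full connectivity rather than one triangle at a time.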
And once you discover what kind of driver support they offer, you go right back to Intel.
The new Intel G45 chipset recently made me order a new motherboard just to replace my video card. It's "fast enough", one might say...
Personally, I can't wait to get all that proprietary crap out of my kernel. Shouldn't have fallen for the temptation in the first place.
My Sig: SEGV
If Intel is doing that, the company isn't doing a good job. Instead it is getting publicity from ArsTechnica about how it is communicating in a confused way.
They've stated that it will be a 150W+ chip on a PCI Express 2 card, as I recall, and is intended as a GPU, though it will be fully programmable and have CPU capability (so when not doing GPU stuff, it could serve as extra CPUs). It is intended to compete in the high end graphics market.
Essentially, it's a clutch of high performance software vector units in parallel with a bunch of CPUs. Graphics scale with each added processor because it is a software driven architecture, whereas traditional GPUs don't scale because they have a fixed function pipeline (if everything were written for shaders, I would think it would scale). One of the things Intel is touting is binned rendering (aka chunked or tile rendering), which breaks the frame into tiles, stores a list of front-to-back polygons in off-chip memory, and sizes the tile buffer to fit in cache. Technically, this should be no faster than z-buffering, but I believe they're sorting and ray casting and in a brute-force sort of way this is faster than z-buffering. What I don't get here is how they get "2-7x the performance" because they have the extra sort step.
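For what it's worth, the binning step of tile rendering can be sketched in a few lines. This is only a guess at the general idea, not Intel's implementation: each triangle is dropped into every tile its screen-space bounding box touches, so each tile can later be rendered from its own list. The function name, the tile size, and the bounding-box test are assumptions.

```python
import math


def bin_triangles(triangles, width, height, tile_size):
    """Map (tile_x, tile_y) -> list of triangle indices overlapping that tile.

    triangles: list of three (x, y) screen-space vertices each.
    The bounding-box test is conservative: a triangle may land in a tile
    that its box touches but its actual area does not.
    """
    tiles_x = (width + tile_size - 1) // tile_size
    tiles_y = (height + tile_size - 1) // tile_size
    bins = {(tx, ty): [] for ty in range(tiles_y) for tx in range(tiles_x)}
    for index, tri in enumerate(triangles):
        xs = [x for x, _ in tri]
        ys = [y for _, y in tri]
        # Bounding box in pixels, clamped to the screen.
        x0 = max(math.floor(min(xs)), 0)
        x1 = min(math.floor(max(xs)), width - 1)
        y0 = max(math.floor(min(ys)), 0)
        y1 = min(math.floor(max(ys)), height - 1)
        if x0 > x1 or y0 > y1:
            continue  # entirely off screen
        for ty in range(y0 // tile_size, y1 // tile_size + 1):
            for tx in range(x0 // tile_size, x1 // tile_size + 1):
                bins[(tx, ty)].append(index)
    return bins


if __name__ == "__main__":
    tris = [
        [(10, 10), (20, 10), (10, 20)],        # fits in one tile
        [(50, 50), (100, 50), (50, 100)],      # straddles four tiles
        [(-50, -50), (-10, -50), (-50, -10)],  # off screen
    ]
    for tile, members in sorted(bin_triangles(tris, 128, 128, 64).items()):
        print(tile, members)
```

Once the bins exist, each tile's color and depth data can stay in cache while its list is processed, which is presumably where the bandwidth savings come from.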
By the way, if you look at CPUs, Intel's Core2 line has five power designations:
X - Extreme - power > 75W
E - Standard Desktop 55-75W
T - Standard Mobile 25-55W
L - Low Voltage 15-25W (their name - they mean low power)
U - Ultra Low Voltage - Power < 15W
According to Wikipedia the mini uses mobile processors (the T designation). Max power consumption of most laptops is 80W, so it is likely your mini maxes at 80W.
It's true that by the time C became popular, there already existed many libraries in Fortran that had been tested and debugged, so no new development on those lines was needed. But most new scientific software is being written in C. There are a few diehards who will never admit it, but for all practical purposes Fortran is dead as a development language.
More good analysis here:
http://www.pcper.com/article.php?aid=602
We've had a lot of problems with Intel graphics software. You are correct, however, that we haven't tested the latest offerings from Intel. We felt so abused by the previous chipsets that we have had no desire to test the new software.
The last video driver we tested was version 14311 for the 945 chipset. It had a LOT of problems. There was a LOT of denial by Intel that there were problems.
So, I would be very interested to know: Is the video in the 965 chipset better? Is the software trouble-free? How about rotated vertically on a 1920 x 1200 monitor?
oops - I meant limited ray tracing (not casting). That or painter's algorithm (which optimally uses a sorted list). I never did fully understand ray casting - I jumped from painter's to hardware.
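As a rough illustration of the difference between painter's and z-buffering, here is a one-scanline sketch. It assumes each "polygon" is a flat span with a single depth (smaller means nearer) and that spans never intersect; the function names and data layout are made up for the example.

```python
def painters(spans, width):
    """Painter's algorithm: sort far-to-near, then just overwrite."""
    frame = [None] * width
    for x0, x1, depth, color in sorted(spans, key=lambda s: s[2], reverse=True):
        for x in range(x0, x1):
            frame[x] = color
    return frame


def zbuffer(spans, width):
    """Z-buffer: any submission order, but a depth compare per pixel."""
    frame = [None] * width
    depth_buf = [float("inf")] * width
    for x0, x1, depth, color in spans:
        for x in range(x0, x1):
            if depth < depth_buf[x]:
                depth_buf[x] = depth
                frame[x] = color
    return frame


if __name__ == "__main__":
    # (x_start, x_end_exclusive, depth, color)
    spans = [(0, 6, 5.0, "far"), (2, 4, 1.0, "near"), (3, 8, 3.0, "mid")]
    print(painters(spans, 8))
    print(zbuffer(spans, 8))
```

Both give the same picture here. Painter's pays for the sort up front and skips the per-pixel compare, while the z-buffer does the opposite and also copes with intersecting geometry, which this toy version of painter's does not.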
It isn't nearly as big as a more economical (both in terms of power usage and cost) market, but it is still large. ATi and nVidia are not having problems moving their new high end accelerators that draw hundreds of watts. On the contrary, there were some low stocks of the new GeForces after the prices were dropped.
See: x87 FPUs, cryptographic accelerators, video decoding, GPUs, etc.
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
You may be a "'highend' graphics coder" but you seem to be utterly ignorant of modern GPGPU programming. Might want to fix that. (Forget about shaders...)
Intel's marketing department hasn't handled the introduction very well, in my opinion.