Posted by
CmdrTaco
on from the coming-to-a-pc-near-you dept.
Scrooge919 writes "An article on ZDNet discusses AMD's plan for the successor to Opteron -- the K9. The biggest feature will be that it contains multiple cores. The K9 is currently slated for the second half of 2005, which would be less than 3 years after the Opteron shipped."
Re:Might As Well Get It Over With...
by eln · Score: 2
Personally, I'm withholding judgement until I hear what Jim Belushi thinks about the product.
Multiple cores?
by adaknight · Score: 3, Insightful
So, now we'll have multiple processing units and pipelines in each core, and multiple cores. The biggest question in my head is how badly memory bandwidth will limit all of this. I just don't see how you can supply data and instructions fast enough to, say, three 3 GHz cores running on the same chip unless you have close to a thousand pins on the chip. The other question would be about cooling. :)
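To put rough numbers on the bandwidth worry, here is a back-of-envelope sketch. Every figure in it (8 bytes consumed per core per cycle, a 5% cache miss rate) is an assumption picked for illustration, not a measurement of any AMD part, and `offchip_bandwidth_gb_s` is a hypothetical helper.

```python
def offchip_bandwidth_gb_s(cores, clock_ghz, bytes_per_cycle, cache_miss_rate):
    """Rough off-chip bandwidth demand in GB/s.

    All inputs are illustrative guesses: bytes_per_cycle is how much data
    and instruction traffic one core wants per clock, and cache_miss_rate
    is the fraction of that traffic that actually has to leave the chip.
    """
    demand_per_core = clock_ghz * bytes_per_cycle  # GB/s if nothing were cached
    return cores * demand_per_core * cache_miss_rate


# Three 3 GHz cores, as in the comment above.
no_cache = offchip_bandwidth_gb_s(3, 3.0, 8, 1.0)     # every access goes off-chip
with_cache = offchip_bandwidth_gb_s(3, 3.0, 8, 0.05)  # assumed 5% miss rate

print(f"no cache:   {no_cache:.1f} GB/s")
print(f"with cache: {with_cache:.1f} GB/s")
```

Under these assumptions the pin count only has to cover the traffic that misses the on-chip caches, so the answer swings by more than an order of magnitude depending on the miss rate you believe in.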
-- hrm. then again. maybe not.
AMD TimeLine to Reality Generator?
by supremebob · Score: 2, Insightful
Let's see... AMD missed the original launch date of their Barton core CPUs by at least 3 months, missed the launch date of the Opteron by over 6 months, and the original launch date of the Athlon 64 by almost a year.
If they're saying now that the chip will be 4Q 2005, when should we REALLY be expecting it to show up on store shelves? 3Q 2006? 1Q 2007, maybe? :)
This makes a lot of sense.
by NerveGas · Score: 2, Interesting
As the manufacturing process shrinks, and companies are able to put more transistors on a chip, the question arises: What should we use those extra transistors for?
Now, there are several options. They could come up with a new processor design, but that takes a tremendous amount of R&D. They could just put tons of cache on the chip, but that gives diminishing returns.
Or.... the Opterons already have a very simple I/O mechanism, namely, HyperTransport. Literally all they have to do is plop down two Opteron cores, connect the HyperTransport lines, and bam: Dual-core processor. I'm honestly surprised they're not doing it SOONER.
Of course, the lines for memory controllers and the like have to be drawn out to the pins on the packaging, but that's a piece of cake.
steve
-- Oh, you're not stuck, you're just unable to let go of the onion rings.
As compared to AIBO, which of course is a "fake" dog. But if they put a K9 Processor in the AIBO, we have a conflicted pet that is a real fake dog.
When shall we be free of the X86?
by LWATCDR · Score: 4, Interesting
Folks, we really do not need to run DOS applications any more. If we do, couldn't we emulate them? I just do not believe that the IAx86 is the best IA for the future. The idea that in 30 years we will be running some mutant 128-bit X86 chip makes my skin crawl. I guess I miss the days when new ideas were the norm for microcomputers. Remember when there was the 32032, 68020, TM990, Zilog Z8000, the 6502 family, and the 88000? How about it, Transmeta? Let's see a version of Linux that does not run on top of the translation layer. Let's get some new ideas out there; I am getting bored. Now that I said that, GO AMD. While it is still X86, this is one of the more interesting ideas I have seen for a while.
-- See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
They're doing what now?
by PaschalNee · Score: 3, Informative
For those of you who live mainly in the software world (myself included) there's a very good overview of all things CPU on Ars Technica. Detailed enough to be interesting but starts at a basic enough level.
And remember that nothing impresses the ladies more than somebody who knows why multiple cores might be interesting.
Re:When will it end???
by The One KEA · Score: 2, Informative
It's only a drag when operating in x86 Legacy Mode on an AMD64-based core, or when running 32-bit code in Compatibility Mode, which still sees only the eight legacy registers. When you're operating in 64-bit mode, the native sub-mode of x86-64 Long Mode, you get access to sixteen 64-bit general-purpose registers. Here's a graphic which explains it quite nicely: http://www.devx.com/assets/amd/5929.gif
The rest of the article explains the concepts of the AMD64 architecture. Link: http://www.devx.com/amd/Article/16018
-- SCREW THE ADS! http://adblock.mozdev.org/
Proud user of teh Fox of Fire - Registered Linux User #289618
K9 huh? Do we need a ziplock baggy when it takes a core dump?
And a serious comment...
by jd · Score: 3, Interesting
"Multiple cores" is meaningless, with today's microprocessors. Typically, there will be multiple execution units for common instructions. Pipelining, pre-fetch and branch prediction all increase performance by more than can be obtained by using antiquated SMP-style approaches. It's far more important to distribute the bus load over time, as that is the larger bottleneck.
By having multiple register sets within a single core, and tagging requests/results, you can avoid the complexity of SMP entirely, while producing the effect of having multiple processors.
If you want to go further, improve the support for internal routing of operations. Thus, if you've instructions operating on the same data, the data can be directly sent from logic element to logic element. The entire chain could then be executed as a single instruction (albeit composite). This also eliminates the need to have a CISC-to-RISC layer in the processor, as complex instructions would be mapped by routing commands and not by multiple internal fetch/execute cycles.
By adding input/output FIFO queues to each instruction, where each node in the queue tagged the "virtual" processor associated with that instruction, the CPU would be limited in the number of CPUs it would look like only by the number of bits used in the tag. (E.g., an 8-bit tag gives you 256 virtual CPUs on a single die.)
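As a toy illustration of the tagged-queue idea, the sketch below models one shared "logic element" (an adder) that pulls tagged requests off a FIFO and files each result under the virtual CPU that issued it. The class and function names (`TaggedAdder`, `max_virtual_cpus`) are hypothetical and this is a software analogy only; it says nothing about how real hardware would be laid out.

```python
from collections import deque


def max_virtual_cpus(tag_bits):
    """Number of distinct virtual CPUs a tag field of this width can name."""
    return 1 << tag_bits


class TaggedAdder:
    """Toy logic element: one shared adder fed by a FIFO of tagged requests."""

    def __init__(self, tag_bits):
        self.tag_limit = max_virtual_cpus(tag_bits)
        self.inbox = deque()

    def submit(self, tag, a, b):
        # The tag identifies which virtual CPU the request belongs to.
        if not 0 <= tag < self.tag_limit:
            raise ValueError("tag does not fit in the tag field")
        self.inbox.append((tag, a, b))

    def drain(self):
        """Process requests in FIFO order and group results by tag."""
        results = {}
        while self.inbox:
            tag, a, b = self.inbox.popleft()
            results.setdefault(tag, []).append(a + b)
        return results


adder = TaggedAdder(tag_bits=8)
adder.submit(0, 1, 2)      # virtual CPU 0
adder.submit(255, 10, 20)  # virtual CPU 255, the highest 8-bit tag
adder.submit(0, 5, 5)      # virtual CPU 0 again
results = adder.drain()

print(max_virtual_cpus(8))  # 256
print(results)              # {0: [3, 10], 255: [30]}
```

The adder never switches context; it just reads the tag on each entry, which is the bookkeeping-level parallelism the comment is describing.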
Why is this better than "true" SMP? Because 2 CPUs can't run a single thread faster than 1 CPU. Programs are generally written with single processor systems in mind, and therefore cannot run any better when the extra resources exist.
Sub-instruction parallelism allows you to run as fast as you can fetch the instructions. Because the parallelism is merely at the bookkeeping level, there's no overhead for extra threads.
Because the logic elements would pull off the queues, as and when they were free to do so, there's no task-switching latency.
Because the parallelism is sub-instruction, and not at the instruction block or thread level, more of the resources get used more of the time, thus increasing CPU utilization. It also means that tasks that aren't parallel at a coarse-grain can likely get some benefit, as there may well be parallelizations that can be done at the element level.
Because a single, larger die can carry with it more useful silicon than two or more separate dies. (Which is likely why AMD are using multiple cores in their K9 CPU.)
AMD's approach is an improvement over the separate CPU schema, but it's nowhere near the potential an element-cluster could provide. The parallelism that can be gained is way too coarse-grain. It'll offer about the same level of improvement the move from separate 386 and 387 chips to the 486DX did, for much the same reason. Reduced distances and reduced voltages allowed for faster clock rates on the same technology.
But engineering at the right level will always produce better results than cut-and-paste construction, even if it does require more thought.
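The claim above that two CPUs cannot run a single thread faster than one is usually quantified with Amdahl's law, which the comment does not name. A minimal sketch, assuming a program can be split into a fraction that parallelizes perfectly and a fraction that stays serial (`amdahl_speedup` is a hypothetical helper name):

```python
def amdahl_speedup(parallel_fraction, processors):
    """Ideal speedup under Amdahl's law.

    parallel_fraction: share of the run time that can be spread across
    processors (0.0 = purely single-threaded, 1.0 = perfectly parallel).
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


single_threaded = amdahl_speedup(0.0, 2)  # a program written for one CPU
half_parallel = amdahl_speedup(0.5, 2)    # half the work can be split
fully_parallel = amdahl_speedup(1.0, 2)   # the best case for two cores

print(single_threaded)           # 1.0
print(round(half_parallel, 3))   # 1.333
print(fully_parallel)            # 2.0
```

With a parallel fraction of zero, the second core buys nothing, which is the single-thread case the comment is worried about.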
-- It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
I know this has been said countless times, but...
by Knights who say 'INT · Score: 3, Informative
Why do I need -more- processing power?
I don't do any 3D rendering, but I believe I do more processor-heavy work than the average Carlos - sp. big numerical differential equation and bigbigbig linear optimization stuff in Maple - and my tienda-de-descuentos K6-II still crunches the stuff faster than I could ever desire.
The main problem with personal computers is that they use hard drives for memory swap space when they should be using RAM to cache the hard drives.
If I could spend $500 on my computer right now I'd fill it with as much memory as the architecture allows. I'd then run a ramdrive and direct many of the computer activities to there.
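The idea of using RAM as a cache in front of the hard drive can be sketched as a small least-recently-used block cache. This is a toy model under stated assumptions: `BlockCache` and `fake_disk_read` are hypothetical names, the "disk" is a stand-in function, and real operating systems implement their disk caches quite differently.

```python
from collections import OrderedDict


class BlockCache:
    """Toy LRU cache that keeps recently read disk blocks in RAM."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity      # how many blocks fit in RAM
        self.read_block = read_block  # slow path: fetch a block from disk
        self.blocks = OrderedDict()   # block number -> data, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, block_no):
        if block_no in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_no)  # mark as most recently used
            return self.blocks[block_no]
        self.misses += 1
        data = self.read_block(block_no)
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)    # evict the least recently used
        return data


def fake_disk_read(block_no):
    # Stand-in for a slow hard drive access.
    return f"block-{block_no}".encode()


cache = BlockCache(capacity=2, read_block=fake_disk_read)
for n in [1, 2, 1, 3, 1, 2]:
    cache.get(n)

print(cache.hits, cache.misses)  # 2 4
```

Every hit is a disk access that never happens, so the more RAM you give the cache, the less the drive mechanism has to move.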
I mean, when a webpage opens, a banner is downloaded to my hard drive. That's just irrational. And it prolly wears the hard drive's physical mechanism faster too.
But then again, we don't have a benchmark of ram speed, nor do we have hypemakers touting new, faster RAM. And prolly there's not too much activity in technologically improving RAM either.
It's about time.
Standing on the shoulders of giants.
New "K9" chip promises to be "a real dog".
Seeing The Core once was enough for me. There was a reason it was such a dog at the box office.
Rank Presidents by th