ARM Unveils One-chip SMP Multiprocessor Core
An anonymous reader writes "ARM Ltd. will unveil a unique multi-processor core technology, capable of up to 4-way cache coherent symmetric multi-processing (SMP) running Linux, this week at the Embedded Processor Forum in San Jose, Calif. The "synthesizable multiprocessor" core -- a first for ARM -- is the result of a partnership with NEC Electronics announced last October, and is based on ARM's ARMv6 architecture.
ARM says its new "MPCore" multiprocessor core can be configured to contain between one and four processors delivering up to 2600 Dhrystone MIPS of aggregate performance, based on clock rates between 335 and 550 MHz."
(and some say I don't) but this article looks like Alphabet Soup! with all the acronyms and all. Very Interesting topic - not for the Noob.
- Your stupidity got you into this mess, why can't it get you out? -Will Rogers
Have you never heard of Multi threading?
On a WorkStation, I would agree with you, but on any server with thread optimised applications, more threads = more power...
Once again, People think WorkStation, for things not designed for the WorkStation market
go buy an intel celeron cpu then if MHZ is the only thing that matters..
arm cpu's being used mainly in devices with limited electrical power available anyways.. if this gets them more processing power per watt then all the better.
world was created 5 seconds before this post as it is.
A lower core clock can save you a lot... both financially and in energy. Raising the clock rate on a chip will increase its energy usage exponentially.
If the problems you want to solve are parallel enough why not?
Jeroen
Secure messaging: http://quickmsg.vreeken.net/
I think the one thing that we're all waiting for is the introduction of on-chip system memory. Currently, the cache of a high-performance processor consumes more than half of the chip area because the penalty for a cache miss is so large. For decades now, memory frequency scaling has lagged that of the microprocessor. Although there have been some great strides recently, latency is still rearing its ugly head. External DRAM is too electrically distant to remain at the heart of any high-performance system.
Once we get processor and memory combined, we'll see performance increasing by several orders of magnitude. Processor architecture will matter even less, since emulation of *any* architecture will become trivial in terms of available processing speed. Your Thumb-like prediction will most certainly pan out to some magnitude.
Life is the leading cause of death in America.
Just the other day I was thinking about a "Massively Multiprocessor" ARM computer. It came to me after reading about a cluster of VIA low-power computers.
;)
So, ARMs are even lower power, they are designed quite correctly from the ground up[1] and the only thing that's missing is an FPU. But a computer with 100 ARM CPUs would run faster than any ix86 today and probably would consume less power than the latest P4/K7/K8.
Give me a 64-proc (*4 cores per proc, so 256 cores) Linux machine anytime
Robert
[1] Anyone who knows the internals of today's ix86 processors from any vendor knows what a mess it is to use today's technology with an ancient ISA like ix86.
Bastard Operator From 193.219.28.162
Alas, they were not "PC-compatible" and at a certain time the Intel/AMD clones with Linux became much more attractive.
Are you talking Sharp Zaurus? I'm eyeing one (if I could order them in the Netherlands...)
extern warranty;
main()
{
(void)warranty;
}
No doubt there are people who need this kinda raw power. Rendering a movie is a good example.
But 99% of the people out there (and 99% of the software) can't really take advantage of that kind of power.
But darned if they don't HAVE to have the latest thing on the market. Like spending 4 times as much for bleeding edge equipment will keep their computers from becoming "out-dated". And I'm not just talking early adopters, I'm talking GrandMothers and Young Nerdlings.
People pay outrageous amounts for equipment they will never use (no I'm not talking about home gyms).
I would rather be ashes than dust!
If you're comparing it to Xeons and Opterons, you're not even in the same market.
A lower core clock can save you a lot... both financially and in energy. Raising the clock rate on a chip will increase its energy usage exponentially.
[Rant]Why, oh _why_, do people keep horribly abusing the word "exponentially"?[/Rant]
Power goes up in direct proportion to the clock rate. This is a "linear" relation. If it was really "exponential", we'd be stuck running 10 MHz processors because anything else would melt.
For the really pedantic, the way to compute dynamic power dissipation is to figure out how much capacitance you have on nodes that are being switched, what fraction of the time they're being switched, the amount of energy required to switch a given amount of capacitance (depends on signal voltage), and the frequency, and multiply these together:
P = (1/2) * Vdd^2 * C_node * N_nodes * transitions_per_clock * f
The only thing that's _not_ linear is the power-vs-voltage relation, and that's _quadratic_. Anything "exponential" sucks a whole lot worse.
[ObDisclaimer about clock feedthrough, but that's linear with frequency and capacitance too.]
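To put rough numbers on the formula above, here is a minimal Python sketch. Every component value in it (node capacitance, node count, activity factor, supply voltage) is a made-up illustrative assumption, not a figure for MPCore or any real chip; only the 335 MHz clock is borrowed from the article summary. The function name `dynamic_power` is also just something picked for this example.

```python
def dynamic_power(vdd, c_node, n_nodes, transitions_per_clock, freq_hz):
    """Dynamic switching power in watts.

    P = (1/2) * Vdd^2 * C_node * N_nodes * transitions_per_clock * f
    """
    return 0.5 * vdd ** 2 * c_node * n_nodes * transitions_per_clock * freq_hz


# Hypothetical chip parameters (assumptions, chosen only for illustration).
VDD = 1.2            # supply voltage in volts
C_NODE = 1e-15       # capacitance per switched node in farads
N_NODES = 1e7        # number of nodes that can switch
ACTIVITY = 0.1       # average transitions per node per clock
F_BASE = 335e6       # clock frequency in Hz

base = dynamic_power(VDD, C_NODE, N_NODES, ACTIVITY, F_BASE)
doubled_clock = dynamic_power(VDD, C_NODE, N_NODES, ACTIVITY, 2 * F_BASE)
doubled_voltage = dynamic_power(2 * VDD, C_NODE, N_NODES, ACTIVITY, F_BASE)

# Doubling the clock roughly doubles the power (linear in f);
# doubling the voltage roughly quadruples it (quadratic in Vdd).
print("power ratio for 2x clock:  ", doubled_clock / base)
print("power ratio for 2x voltage:", doubled_voltage / base)
```

Neither ratio grows anywhere near as fast as an exponential would, which is the whole point of the rant.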
One of the technologies you'll start seeing for high-performance embedded systems (and can find now, in a few places), is core pinouts designed as the mirror image of a standard DRAM memory pinout. With this setup, a CPU can be put on one side of a four (sometimes five) layer circuit board, normally, and a DRAM chip (single chip, so about 1Gb max for most usage; no double channel) can be put directly opposite it, with vias connecting the two. The electrical connection of the signalling wires between the two is extremely good, and allows much higher speed, lower latency memory to be used.
I've had this sig for three days.
Second, not every server needs a gigantic address space, but could still benefit from additional CPU power. A small but very active database server, for example. There are plenty of 32 bit SMP systems out in the real world doing real work right now, after all.
Third, embedded systems are frequently still 8 bit, commonly 16 bit, and only recently has 32 bit become common as the more recent low-power designs have been released. Using a more powerful processor reduces your development time because you only have to write assembler for tightly timed and very short loops, and you can just throw high-level functions around.
Finally, a processor like this would be excellent for the console video game market, although it doesn't look like ARM is going to have a chance to supply it. However, a two-core version could easily end up in a handheld - people are always finding new ways to consume more and more CPU time on handheld devices. A handheld with IEEE1394 and high quality video output might be the ultimate multimedia device. And I personally dream of having a single device which fits in my pocket and performs the duties of a (basic) laptop, a cellular phone, and a PDA. If you had a little pocket projector, and it took 1394, plus it had one of those laser-scanning rangefinding keyboards in it, you could do some real work. I'm not sure about the pointing device though, maybe you could have a laser pointer that you aimed at the projected display and hit a button to change from red to green, which could be picked up by the device. I mean, it's going to be a camera phone too, right?
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Are you sure it's the 1st time ARM has produced a synthesizable core? (despite what the article says)
A little over a month ago I sat through a presentation by one of the guys near the top of ARM's research division...
It was a general overview of ARM's business model (it's an IP company) and products followed by some other material. During the presentation some cores were marked as synthesizable, others were marked as the opposite (I forget the specific term that was used).
To the best of my knowledge all the cores reviewed in the presentation were already released and in production.
If integer division is too slow, then don't do integer division. Obviously I have no idea how much experience you have, but I thought it was common knowledge that integer division is always slow, even when implemented in hardware. Divisions can take many cycles to complete, and during that time, depending on the architecture, you might not be able to perform any other instructions in parallel. The end result is that doing many divisions is going to kill the performance anyways, so there's little benefit to including it in the chip.
If you must do integer divisions, you can optimize the code a little. For example, if you're dividing by powers of two, use right shifts instead of divisions. But, if you can, it's better to avoid integer division on embedded systems to begin with.
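As a rough illustration of the shift trick, here is a minimal Python sketch. The helper names `div_pow2` and `mod_pow2` are hypothetical, invented for this example. One caveat worth stating: Python's `>>` rounds toward negative infinity, the same as its `//`, whereas in C signed division truncates toward zero and right-shifting a negative value is implementation-defined, so on an embedded C target this substitution is only a safe drop-in for unsigned or known non-negative values.

```python
def div_pow2(x, k):
    """Divide x by 2**k using a right shift instead of a division."""
    return x >> k


def mod_pow2(x, k):
    """Remainder of x divided by 2**k, using a bit mask instead of a modulo."""
    return x & ((1 << k) - 1)


# Check the shift and mask against the ordinary operators for a range of
# non-negative values and shift amounts.
for k in range(0, 8):
    for x in range(0, 1000):
        assert div_pow2(x, k) == x // (2 ** k)
        assert mod_pow2(x, k) == x % (2 ** k)

print(div_pow2(40, 3), mod_pow2(43, 3))
```

A decent optimizing compiler will usually make this substitution itself when the divisor is a compile-time constant, so the manual version matters mostly when the divisor is only known to be a power of two at run time.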