ARM Unveils One-chip SMP Multiprocessor Core
An anonymous reader writes "ARM Ltd. will unveil a unique multi-processor core technology, capable of up to 4-way cache coherent symmetric multi-processing (SMP) running Linux, this week at the Embedded Processor Forum in San Jose, Calif. The "synthesizable multiprocessor" core -- a first for ARM -- is the result of a partnership with NEC Electronics announced last October, and is based on ARM's ARMv6 architecture.
ARM says its new "MPCore" multiprocessor core can be configured to contain between one and four processors delivering up to 2600 Dhrystone MIPS of aggregate performance, based on clock rates between 335 and 550 MHz."
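As a rough sanity check on those figures, here is a small sketch of the per-core arithmetic. It assumes the 2600 Dhrystone MIPS peak corresponds to all four cores running at the top 550 MHz clock and that the load divides evenly; the summary does not state either, so treat the result as a ballpark only. The function name is made up for illustration.

```python
# Figures quoted in the summary above.
AGGREGATE_DMIPS = 2600   # "up to 2600 Dhrystone MIPS" across the whole MPCore
MAX_CORES = 4            # "between one and four processors"
MAX_CLOCK_MHZ = 550      # top of the quoted 335-550 MHz range


def dmips_per_mhz(aggregate_dmips, cores, clock_mhz):
    """Per-core Dhrystone MIPS per MHz, assuming the work splits evenly."""
    return aggregate_dmips / (cores * clock_mhz)


# Assumption: the peak figure is for 4 cores at the maximum clock.
per_core = AGGREGATE_DMIPS / MAX_CORES
efficiency = dmips_per_mhz(AGGREGATE_DMIPS, MAX_CORES, MAX_CLOCK_MHZ)
print(f"{per_core:.0f} DMIPS per core, {efficiency:.2f} DMIPS/MHz")
```

Under those assumptions each core contributes 650 DMIPS, a little under 1.2 DMIPS per MHz.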
la LA la
in your face!
slashdot = teh spoke!
forgot to log in :)
you're winner!!
Please try to keep posts on topic.
Looks like here we are pointing at server technology.
How long before we have a 64/32/16-bit variable word size Thumb-like architecture?
And if you thought that was boring you obviously haven't read my Journal ;-)
I heard that all the big companies are switching to LINUX. There is no way they will be using something as out of date as ARM.
..... .....
:)
What do you want, a cookie?
Seriously though, this would be great to run Linux on... Like a new Zaurus perhaps
I've got more mod points and GMail invi
The MPCore multiprocessor enables system designers to view the core as a single "uniprocessor", simplifying development and reducing time-to-market, according to ARM.
The opposite of HyperThreading? 4 CPUs to one instead of 1 CPU to 2?
The only thing that I can guess they mean by simplifying is that a developer would not have to design a multi-threaded application to take advantage of the other threads.
(and some say I don't) but this article looks like Alphabet Soup! with all the acronyms and all. Very Interesting topic - not for the Noob.
- Your stupidity got you into this mess, why can't it get you out? -Will Rogers
After finding out that Linux does NOT and has NO functionality to speak of. I purchased XP-Pro upgrade. To sum it all up, one word - AWESOME! Do yourself a favor and try this, it is plug and play to the extreme, is fast booting up and tons of applications available. Also purchased was Office 2003 Professional, the email client OUTLOOK is the best to date.
Overall, you cannot go wrong with Windows XP-Pro.
Why do I care about a multiprocessor core?
The _ONLY_ reason to do this is as a last resort when you can no longer clock your existing core any higher.
Give me a fast single-core CPU any day. When and only when that single core is maxed out should you present me with a dual core. Any dual-core CPU not running at maximum core clock is wasteful and inefficient IMHO.
In case you were wondering what that is all about...
Synthesis of a core is analogous to compiling your software, except that for an FPGA it is processing a hardware description language like VHDL or Verilog to create the 'code' used to load the FPGA.
This is a big plus for people wanting to put a wicked fast processing unit in the core along with whatever custom IO goodies they can come up with.
Too bad it's not open source, as there are other wicked fast processor cores available. For example, Xilinx can license you to put a PowerPC in its FPGA cores.
This site has much more technical information for my inner geek.
This ARM looks really awesome, it's nice to see them on the cutting edge.
IMO this new "multiple CPUs per chip" approach is the way forward, and the huge power savings are an added bonus. One question springs to mind though: how much performance can you gain by using this technique? I mean, sooner or later you will hit the limits of, say, the memory bus or the graphics bus or whatever (speaking in layman's terms, obviously), especially in environments where power consumption is an issue, and huge memory banks take a lot of power to keep them refreshed. Still, I welcome the development; SMP-type setups can make a computing experience easier to cope with during intensive use like compiling and other CPU-intensive tasks.
Will wank off Linus Torvalds for fame.
...how to make my Grendel Cluster!
I have been a user for about 10 years. This ends Feb 2014. The site's been ruined. I'm off. Dice, FU
Locutus of Microsoft? Is that you? Has the ambassador of the assimilated arrived among us to dispense sage marketing hype?
Don't blame Durga. I voted for Centauri.
When was the last time you saw one of us admit that they had no idea what they were saying?
But what are some uses for this? If I'm not mistaken this is a 32-bit architecture, so it has its limits when it comes to scaling, and it's not powerful enough for one of those supercomputers, so what is the target market?
Cobalt servers were originally based on ARM processors, and were for the most part really nifty. Most palmtop and cell devices also use the processors, so my question is, why don't we see more reasonable personal computers (or blade servers) based upon this architecture? People don't use the processing capacity available to them, and tuning of storage and networking often gives a better return per dollar. Something along the profile of the Psion Netbook or old (or new depending upon your perspective) Apple Newton (also ARM) would be very cool and useful. Give it some cellular/WiFi tech...
Exactly what I was looking for! Finally a computer capable of letting me balance my checkbook, use a word processor, watch a video, and browse the web!
Is any one else getting the impression that our entire industry is driven by penis envy?
"It's bigger, it's faster, stronger! More Power!" About the only flaw in my theory is the continuing trend of decreasing computer sizes. But I can attribute that to the fact that it lets people put them in their pockets.
BTW: If you actually use your CPU(s), this doesn't apply to you. Your penis is bigger.
I would rather be ashes than dust!
Incorrect.
As the subject line says, I've been running SMP desktop PCs for years. My current home PC is a dual 1GHz P-III, my wife's is a dual 850 and my Linux web/file/mail/whatever server is a dual 700 with a 12% overclock.
You can only figure on about a 40% performance increase with a dual processor desktop PC, but being able to play Quake and burn a DVD at the same time has its advantages ;-)
As others have mentioned, multitasking is greatly enhanced - and two midrange processors are generally cheaper than one high-end processor.
Also, even though some applications aren't multithreaded, all modern desktop OS are - so you get a performance boost even running single-task applications. If you're into running Windows, Internet Explorer is multithreaded, as are all Microsoft Office applications. There's a real-world productivity boost using SMP machines.
we see things not as they are, but as we are.
-- anais nin
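For anyone wondering what the "about 40%" figure quoted above would imply for a four-way part, here is a minimal sketch using Amdahl's law. This is my own back-of-envelope model rather than anything from the post or the article: it assumes the workload has a fixed parallel fraction and ignores memory-bus contention entirely.

```python
def parallel_fraction(speedup, n):
    """Invert Amdahl's law, S = 1 / ((1 - p) + p / n), to solve for p."""
    return (1 - 1 / speedup) * n / (n - 1)


def amdahl_speedup(p, n):
    """Speedup on n processors when a fraction p of the work parallelizes."""
    return 1 / ((1 - p) + p / n)


# A 1.4x gain on two processors implies this parallel fraction...
p = parallel_fraction(1.4, 2)
# ...which, under the same model, projects to this on four.
four_way = amdahl_speedup(p, 4)
print(f"parallel fraction {p:.3f}, projected 4-way speedup {four_way:.2f}x")
```

With those assumptions the parallel fraction comes out at about 0.57 and the projected four-way speedup at 1.75x, so the second pair of processors buys noticeably less than the first extra one did.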
Just the other day I was thinking about a "Massively Multiprocessor" ARM computer. It came to me after reading about a cluster of VIA low-power computers.
;)
So, ARMs are even lower power, they are designed quite correctly from the ground up[1] and the only thing that's missing is an FPU. But a computer with 100 ARM CPUs would run faster than any ix86 today and would probably consume less power than the latest P4/K7/K8.
Give me a 64-proc (*4 cores per proc, so 256 cores) Linux machine any time
Robert
[1] Anyone who knows the internals of today's ix86 processors from any vendor knows what a mess it is to use today's technology with an ancient ISA like ix86.
Bastard Operator From 193.219.28.162
Let's talk some real numbers.
How will it fare against, say a Xeon with HT or 2 Opterons?
How will it stack up in price?
Chip Multiprocessors!! Another headache for programmers. check out this www.cradle.com
If I recall correctly, chips prior to ARM6 had register 15 (ARM's PC) designed with the upper six bits reserved for status. Having a program address space of only 2^26 = 64 MB was a major obstacle, even for (successors of) Acorn's RiscPC, a desktop model. With that resolved in the ARM6 series, it is still unable to look beyond the 4GB boundary. In the 4-way SMP server market this is likely to become a major pain.
So either they found a nice way to add yet more MIPS per megahertz (or per watt) to serve higher-end embedded systems or they're targeting (very) low-end servers.
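To make the 26-bit point concrete, here is a small sketch of how a combined PC/status register of that kind can be unpacked. The post only says the upper six bits held status; the exact positions used here (flags N, Z, C, V, I, F in bits 31..26, the word-aligned PC in bits 25..2, mode in bits 1..0) are my recollection of the old 26-bit layout, so treat them as an assumption. The function name is made up for illustration.

```python
FLAG_NAMES = "NZCVIF"      # assumed order for bits 31..26
PC_MASK = 0x03FFFFFC       # bits 25..2: word-aligned program counter
MODE_MASK = 0x00000003     # bits 1..0: processor mode (assumed)


def decode_r15(r15):
    """Split a 32-bit combined PC/status word into its assumed fields."""
    flags = {
        name: bool((r15 >> (31 - i)) & 1)
        for i, name in enumerate(FLAG_NAMES)
    }
    return {"pc": r15 & PC_MASK, "mode": r15 & MODE_MASK, "flags": flags}


# Address space sizes mentioned in the post.
ADDRESS_SPACE_26 = 1 << 26   # 64 MB
ADDRESS_SPACE_32 = 1 << 32   # 4 GB

print(decode_r15(0x80008003))
print(ADDRESS_SPACE_26 // 2**20, "MB vs", ADDRESS_SPACE_32 // 2**30, "GB")
```

Because status and PC share one register, the PC can never grow past 26 bits without moving the flags elsewhere, which is what the later 32-bit designs did.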
PMC-Sierra's MIPS-based RM9000x2GL's are really neat. It's been out for some months now. I'd love to see a machine with several dozen of these.
One is a ~1990 era version of the ARMv3 architecture (IIRC).
The other is ARM's latest version of the ARM architecture.
26-bit addressing limitations were removed ~14 years ago. I don't even think any of the more recent versions of the ARM architecture support it.
This is one of the reasons why Linux will eventually win in the handheld/cell phone space. Unlike WinCE, Symbian and PalmOS, Linux already supports SMP. Linux is light years ahead of WinCE, Symbian and PalmOS on all key core technology features such as SMP. I know for a fact that Linux is being used to validate these features on future ARM processors. So, companies that based their products on Linux won't have to worry about the OS running on the new processors. The proprietary OSes will be playing catchup forever. I will not be surprised if Microsoft has to redesign WinCE from scratch yet again to accommodate SMP.
You don't use an Opteron in the same situation as an ARM core. It's a synthesisable mini processor used for controlling real-time systems. It can be embedded in chips with custom VLSI logic to provide a platform for an operating system. It's not meant for competing with Opterons or any other such stupid ideas.
Why 4 cores?
Not all customers need 4 cores; some only need 1 (washing machines) or maybe 2. The system is therefore scalable to die size/power/cost requirements. Note it's configurable; it does not have to have 4 cores. If I were a customer of ARM I could choose how much die space to devote to the core and how much power I really needed.
4 cores, instead of one bigger, more complex one, are easier to engineer and get right. Look at modern graphics architectures; it's the same principle (though one can argue about cache coherency).
Multiple cores would make dynamic power management much easier to handle, I imagine. An entire core could shut down when its process(es) are not busy. A properly designed embedded system could benefit enormously from this power saving, and the hardware design is relatively easy compared with trying to cut voltage on one large core.
Embedded systems using ARM cores often need to meet real-time needs. One advantage of a multicore system would be to place a critical software component on a single core and, with correct use of memory, guarantee a fixed throughput rate of data. Of course I can use thread priorities but this makes things harder IMO. Maybe that's what they refer to by easier programming.
To me, this looks like a clean idea, which although not revolutionary in terms of an idea, does provide significant advantages for embedded device designers by being synthesisable.
Wroceng
(no association with ARM at all but I forgot my password temporarily)
Reading the article is not required; just skimming it reveals a diagram with 4 CPUs, each with its own cache connected by arrows to a large blob called "Snoop Control Unit".
I would imagine a wristwatch that can do voice processing and movie rendering.
This would seem to go hand in hand with the current thinking on on-the-fly OCR/language translation. I watched a show last night about a camera and PDA gizmo that could translate a road sign for you. I think that one did it via a server-based imaging system. But if you do all that internally the possibilities are endless, and hopefully not trivial, like SMP pong or really fancy ringtones.
low electrical power + high CPU power == quick results and small size that does not require a radio flyer full of batteries.
ARM SUCKS!
I already did some stuff using ARM7TDMI and I can say that it SUCKS BIG TIME.
Why? NO INTEGER DIVISION. You have blazing fast code 90% of the time and the other 10% it's crunching the single division in your program.
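For context on why a single division hurts: with no hardware divide instruction, the work has to be done in software, typically one quotient bit per loop iteration. Below is a minimal sketch of the classic shift-and-subtract (restoring) method, written in Python for readability. It illustrates the general technique only and is not the routine any particular ARM toolchain ships; the name `udiv32` is made up.

```python
def udiv32(dividend, divisor):
    """Unsigned 32-bit restoring division: returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    # Walk the dividend from its most significant bit down to bit 0.
    for bit in range(31, -1, -1):
        # Bring the next dividend bit into the running remainder.
        remainder = (remainder << 1) | ((dividend >> bit) & 1)
        # If the divisor fits, subtract it and set this quotient bit.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1 << bit
    return quotient, remainder


print(udiv32(100, 7))
```

That is up to 32 trips around the loop for one divide, versus a cycle or so for an add, which is the gap the poster is complaining about. Dividing by a known constant can usually be replaced with a multiply and a shift instead.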
how long until
Good lord, I knew it was an ARM ploy all along! /game geek
I've been an ARM fan for many, many years, so it's great to see this development. I've always thought this kind of thing should happen with ARM chips, and that the ARM should be well suited for this kind of application.
ARM cores have a great advantage of having an incredibly low transistor count. As a result the simpler ARM chips tend to have incredibly good production yields. I don't know if that's true for the more complex ARM variants like XScale. This multi-core processor should also be an order of magnitude less complicated than a Pentium, so it too should get good yields and thus for volume production be very cheap.
However it's also always struck me that the low transistor count of ARM chips could be of use in very high performance computing applications. It is difficult to build high transistor count chips in exotic materials, but an ARM-based chip needn't have those problems. This is of course why most chips are still made on silicon.
Also the low transistor count means that even in high speed situations you shouldn't have the clock-skew problems that plague larger processors. (Clock-skew is the problem whereby it takes longer than a single clock tick for a signal to reach from one side of the processor to the other.) A good proportion of the transistors in Pentium IVs and PowerPC G5s are there to deal with that very issue.
Are you sure it's the 1st time ARM has produced a synthesizable core? (despite what the article says)
A little over a month ago I sat through a presentation by one of the guys near the top of ARM's research division...
It was a general overview of ARM's business model (it's an IP company) and products followed by some other material. During the presentation some cores were marked as synthesizable, others were marked as the opposite (I forget the specific term that was used).
To the best of my knowledge all the cores reviewed in the presentation were already released and in production.
2600 MIPS is just a bit less than PIII 1GHz. :)
Would be nice to have this power in a PDA
My home PC also costs almost two orders of magnitude less than a PDP-1 did, even ignoring inflation.
John Sauter, greybeard (J_Sauter@Empire.Net)
I feel that experience with ARM based embedded system will be a good item on an EE student's CV. I wonder what's the most cost effective platform that I should get if I want to play with it?
Forgive me, but I thought that:
1) Intel had bought ARM
2) The Intel PXA was actually a renamed ARM chip
... I guess I dreamed the whole thing up?
--- "I didn't think anyone would understand it" -Prof. Bob Muller
One thing I've always wanted is a comparison of the general efficiencies of different processors. That is, if you made different types of processors the same clock speed, gave them equivalent caches, and ran a benchmark entirely out of cache, how would they all compare?
X86s are supposedly awfully inefficient architectures, so would they come out on bottom? Where would various ARM, xScale, 68k, and PPC processors end up?
Although x86 CPUs have scaled up to some amazing clock frequencies, it seems like their growth has slowed. Intel seems to have implicitly acknowledged this since they're dropping the P4 line for an updated P3 architecture. AMD did the same thing with the Athlon64s, which have slower clock speeds but are faster in the end.
If it turned out that an ARM at, say, 600 MHz turned out to be as fast as a P3 at 1 GHz, then I would say the ARM could leave the embedded market and could become competition in the desktop market. If such systems were significantly cheaper, cooler, smaller, and less power hungry than similar x86 systems, I think they could seriously compete.
As soon as you stated that, I thought, RTFA... But there wasn't one! So, I just said duh!
Yep! MIPS... But, Acorn, Now those are pretty nifty also.
You can also have fun with series expansions and other tricks for turning complex time consuming operations into faster, good-enough variations. It all depends on what matters to you - speed or accuracy.
Cheers,
Toby Haynes
Anything I post is strictly my own thoughts and doesn't necessarily have anything to do with the opinions of IBM.
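As one illustration of the series-expansion trick mentioned above, here is a sketch that trades accuracy for speed by truncating the Taylor series for sine after three terms. The choice of sine, the cut-off point, and the function name are all mine, picked purely as an example; the same idea applies to other expensive functions.

```python
import math


def sin_approx(x):
    """Three-term Taylor series, x - x^3/6 + x^5/120, in nested form.

    Reasonable for |x| <= pi/2; the error grows quickly outside that range.
    """
    x2 = x * x
    return x * (1 - x2 / 6 * (1 - x2 / 20))


# Compare against the library function at a few points.
for angle in (0.1, 0.5, 1.0, math.pi / 2):
    error = abs(sin_approx(angle) - math.sin(angle))
    print(f"x={angle:.4f}  approx={sin_approx(angle):.6f}  error={error:.2e}")
```

The nested form needs only a handful of multiplies and no table lookups. Whether the error near pi/2 is acceptable depends, as the post says, on whether speed or accuracy matters more to you.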
Doesn't 100Mb flash using 180M transistors work out to 1.8 transistors/byte? I'm still just a student, but according to my intro to ECE class, even storing one bit takes more than 1.8 transistors...
Most modern flash memory uses multi-level storage, allowing several bits per cell (I'd known about 4 levels (2 bits), another poster mentioned 8 levels (3 bits)). Storage still only requires one transistor.
The way it works is that you have a FET with a floating gate. In "write" mode, you apply a high voltage to the non-floating gate to drive charge either in or out through the thin oxide layer separating the floating gate and the body. The charge on the floating gate (which is between the sense gate and the body) ends up effectively changing the threshold voltage of the transistor. When the transistor is turned on by the sense gate, you get an amount of current that varies depending on the amount of charge on the floating gate.
Other types of flash memory exist. This is just one of the more common ones.
As for storing single bits, the standard SRAM cell has 6 FETs (two inverters, cross-coupled, and two readout FETs to connect the inverter outputs to a differential read/write bus). A DRAM cell, however, just has one transistor, which connects the read/write bus to a storage capacitor. Among other things, this means that DRAM reads are destructive (capacitor is discharged on to the read/write bus; this disturbance is amplified, driving the bus back to the rail voltage and re-charging the cell's storage capacitor).
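The arithmetic in the question and the multi-level answer can be put side by side in a few lines. This sketch assumes "100Mb" was meant as 100 megabytes (decimal) and that every storage cell is exactly one transistor, as described above. The 180M and 100Mb figures come from the question rather than from any datasheet, so this only shows how the numbers relate.

```python
import math

TRANSISTORS = 180_000_000
CAPACITY_BYTES = 100_000_000   # assumption: "100Mb" means 100 megabytes


def cells_needed(capacity_bytes, bits_per_cell):
    """One-transistor cells required to hold the given capacity."""
    return math.ceil(capacity_bytes * 8 / bits_per_cell)


transistors_per_byte = TRANSISTORS / CAPACITY_BYTES
print(f"{transistors_per_byte:.1f} transistors per byte")

# 1 bit/cell is single-level; 2 and 3 bits/cell are the 4- and 8-level
# schemes mentioned in the reply.
for bits in (1, 2, 3):
    print(f"{bits} bit(s)/cell -> {cells_needed(CAPACITY_BYTES, bits):,} cells")
```

If the original figure was actually 100 megabits, the capacity is eight times smaller and the cell counts shrink to match.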
You can also use the same tools to put the core into an ASIC.
Another good thing about synthesizable is that you can compile it to different specs. For example, the ARM7TDMI-S (S for synthesizable) can be compiled with different instruction decode sections. You can choose a small (cheap) and slow decode or a large (expensive) and fast one. So you can pick the best one for your situation. On most cores you can also select the amount of L1 cache you want (ARM7 doesn't have a cache at all, so it is exempt). Cache is one of the largest users of die space, so being able to size it also helps you keep costs down.