California Researchers Build The World's First 1,000-Processor Chip (ucdavis.edu)
An anonymous reader quotes a report from the University of California, Davis about the world's first microchip with 1,000 independent programmable processors: The 1,000 processors can execute 115 billion instructions per second while dissipating only 0.7 Watts, low enough to be powered by a single AA battery...more than 100 times more efficiently than a modern laptop processor... The energy-efficient "KiloCore" chip has a maximum computation rate of 1.78 trillion instructions per second and contains 621 million transistors.
Programs get split across many processors (each running independently as needed with an average maximum clock frequency of 1.78 gigahertz), "and they transfer data directly to each other rather than using a pooled memory area that can become a bottleneck for data." Imagine how many mind-boggling things will become possible if this much processing power ultimately finds its way into new consumer technologies.
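A rough software analogy for the "transfer data directly to each other" idea, offered only as a sketch: each worker below stands in for one processor and talks to its neighbour through a point-to-point queue, with no shared pool of data. The names `stage`, `run_pipeline` and `SENTINEL` are made up for illustration, and Python threads and queues are an assumption of this sketch, not how the chip is actually programmed.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-stream marker for this sketch

def stage(fn, inbox, outbox):
    """Apply fn to each incoming item and hand the result to the next stage."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown downstream
            return
        outbox.put(fn(item))

def run_pipeline(values, fns):
    """Chain one worker per function, linked only by point-to-point queues."""
    links = [queue.Queue() for _ in range(len(fns) + 1)]
    workers = [
        threading.Thread(target=stage, args=(fn, links[i], links[i + 1]))
        for i, fn in enumerate(fns)
    ]
    for w in workers:
        w.start()
    for v in values:
        links[0].put(v)
    links[0].put(SENTINEL)

    results = []
    while True:
        item = links[-1].get()
        if item is SENTINEL:
            break
        results.append(item)
    for w in workers:
        w.join()
    return results

print(run_pipeline([1, 2, 3], [lambda x: x + 1, lambda x: x * 10]))
```

Each stage only ever touches its own inbox and outbox, which is the property the quote is describing; whether real workloads split this cleanly is a separate question.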
Can this chip run GNU/systemd/Linux?
The press release does not include it, nor does the slashdot summary. The link to the paper: http://vcl.ece.ucdavis.edu/pub...
Maybe things are getting better. Too many programs are single threaded. Too many drivers are single threaded. Yes you can sandbox them.
That leaves out the nasty deadly embrace. Or less nasty, waiting on a key resource to complete.
More cores just get you bound up in your shorts faster.
More cores are not a magic bullet.
What games does this come with
A young intern who likes to "work late" in Davis California has recently come into the possession of a rather large stash of bitcoins.
Yay this is so awesome that researchers have pretty much put 1,000 6502 processors on a single chip. Way to go, maybe in a year we can move on to the equivalent of 1,000 z80 processors on that chip. Yay research!
That is the way of their kind.
But I am not sure what system or software can take advantage of it. Personally I want to see progress being made on quantum computing for consumer-level stuff.
the world's first microchip with 1,000 independent programmable processors ... Imagine how many mind-boggling things will become possible if this much processing power ultimately finds its way into new consumer technologies.
Yeah, but you have to keep in mind how many cores will be left for the user!
1000 cores minus:
* 200 cores for anti-virus software
* 25 cores for the ransomware battling it out with the anti-virus
* 55 cores for Microsoft's Win10 update nagware
* 350 cores for the NSA monitoring
* 122 cores for the FBI monitoring
* 75 cores to handle syncing all your data to the cloud
* 94 cores to run the 3D GUI based desktop
* 62 cores for constant advertising
* 14 cores for Google to keep tabs on what you're doing
* 1 core dedicated to emacs
So, only 2 cores left for the user. No better than an Athlon from 2005, I'm afraid.
...but we are gonna use it mostly for porn.
That is how they be.
Core temp might struggle with this one.
I'll take the lot...plenty of new apps that could do with the power...
If this is reality, how long will your keys need to be to have any protection by encryption?
Imagine a Beowulf cluster of these!
"BSD: Free as in speech. Linux: Free as in beer. Windows 10: Free as in herpes." --Man On Pink Corner in #52607549.
Not a Logical Effort, for sure.
At least 6 years ago a 1000-core processor was already made. I don't see how this is different.
The older article:
http://www.pcworld.com/article/215113/1000_Core_Processor_Eats_Quad_Core_CPUs_For_Lunch.html
They want to kill us all.
It could be an interesting extra chip in a general-use computer, one that programs could offload routines to: for example, certain kinds of video/image rendering, parallelizable mathematical operations, image recognition, a 1000-node neural network, etc.
Judging from its primary applications, which are decoding/encoding and encryption, it seems more similar to a digital signal processor than to a regular CPU.
Perfect for the internet of things. Now rather than just an egg timer I can have a battery-powered supercomputer in my salt shaker that does a finite element simulation of the egg in boiling water, going beep at the perfect moment. The toaster will be able to insult me in the King's English or the Emperor's Mandarin.
And Orange Pi is planning to make a board with one of these that only runs on one of the 1000 cores, and no stable OS.
This thing is, I suspect, suited for programs that parallelize, have little interprocess communication, and run in small memory. Why do I know this? Because if each processor had a large memory and an InfiniBand backplane it would melt. Thus you could update your Facebook status on all 1000 dummy accounts, for example. Or compute pi or chug bitcoins.
Some drink at the fountain of knowledge. Others just gargle.
Sounds like https://en.wikipedia.org/wiki/Connection_Machine
The 1990s called, they want their joke back!
*** Good luck to everyone and have a happy day!
NOP
Aren't the shader units of the modern GPUs like the Geforces basically specialized CPUs?
In this case we're already at 2560 CPUs on a single chip.
The way to improve computational technology is parallelism. What are the usage domains?
-anything video related
--games
--image recognition
-anything AI (I think?)
--autonomous cars
--facial recognition
-a lot of physics applications
Thoughts?
PS: I don't reply to ACs.
The major difference is that this architecture has register to register communications between cores instead of shared L2 cache. Much, much faster.
and for a song and dance.
A GPU, or rather a multi-parallel processor by any other name, since GPUs have become programmable enough to do most anything you'd want, is still a GPU. And that's all they've built, a crappy GPU. Can't wait to see what kind of programs can't run on a chip with no memory coherence, or the amount of thrashing 1,000 unsynced cores could come up with as they all wait for the latency-bound memory accesses they need again and again and again. I wonder what the % of cores doing actual work over time is: 10%? 5%? Less?
When whoever builds this passes computer engineering 102 or whatever and learns what they've done then maybe they can get a job at Qualcomm or something, after they've graduated in 6 more years and actually know something.
It only runs at 1.7 GHz. My Pentium IV running XP runs at 4 GHz! Just ask any Joe Sixpack who bought one over an AMD.
http://saveie6.com/
...contains 621 million transistors... Imagine how many mind-boggling things will become possible if this much processing power ultimately finds its way into new consumer technologies.
Let's see... 1,000 very small compute cores... sounds an awful lot like your typical GP-GPU these days. The only reason the power consumption is so small is that it has < 1 billion transistors. Compare that to the 17 billion transistor Nvidia Pascal monster. Even the non-Iris graphics Skylake desktop CPU has ~1.7 billion, and over half of those are spent on the GPU.
Chances are even paltry Intel HD Graphics running an OpenCL program will have more FLOPS than this thing. Don't be fooled by the flashy headline, the laws of physics still apply.
It's a 32 x 31 grid = 992, plus 8 extra stuck on one edge to make up the numbers.
Think of things like quantum mechanics, finite element analysis, weather prediction, etc. Anything based on a matrix or a subset of elements that can be calculated in parallel. Although I am guessing that memory would be a bottleneck here.
C. Sagan : A demon haunted world:
http://www.amazon.com/gp/product/0345409469/
visit randi.org
All I want for xmas is a new cpu without a built in NSA ME backdoor
Ah. then the "The World's First 1,000-Processor Chip" hype is fully justified!
Processors without register-to-register communications just don't count!
Sounds exactly like a GPU to me. :-P
Computer simulation made easy -- LibGeoDecomp
This sounds like a warming-up of the old idea of the transputer (https://en.wikipedia.org/wiki/Transputer#System_on_a_chip).
Will slow it down to a crawl before blue screening. Then we'll be ready for Windows 24 Home Premium Edition. No worries.
Let's go, shame on whoever thinks ill of it!
I want ass to mouth communication!!
Each CPU supplies an amount of computation less than a single instruction on a regular CPU. Think of it as a grid of instructions, not a grid of computers. A processor has a Harvard architecture with 128 instructions of 40-bit size and a separate data memory with two banks of 128 16-bit data values (256 16-bit data words total). It says nothing about register files or stacks or subroutine calls. It's likely that the two data banks are in effect the register set. The paper implies that a CPU can compute a single floating point operation in software.
Compiling means mapping code fragments to a set of connected CPUs and routing resources, and then feeding the data into the compute array. After some circuitous path through the grid the answer emerges somewhere. There are also 12 independent memory banks, each with 64KB of SRAM, that are available to all CPUs.
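To put the "very little memory" point in numbers, here is a back-of-the-envelope sketch using only the figures quoted in this comment (128 x 40-bit instructions, two banks of 128 x 16-bit data words, 12 shared 64KB banks). The 1,000-core count comes from the summary; the constant names are made up, and nothing here was checked against the paper.

```python
# Figures as quoted in the comment; not verified against the paper.
CORES = 1000
INSTR_WORDS, INSTR_BITS = 128, 40
DATA_BANKS, DATA_WORDS, DATA_BITS = 2, 128, 16
SHARED_BANKS, SHARED_BANK_BYTES = 12, 64 * 1024

# Per-core storage, converted from bits to bytes.
instr_bytes = INSTR_WORDS * INSTR_BITS // 8
data_bytes = DATA_BANKS * DATA_WORDS * DATA_BITS // 8
per_core_bytes = instr_bytes + data_bytes

# Chip-wide totals: all per-core memories plus the shared SRAM banks.
local_total = per_core_bytes * CORES
shared_total = SHARED_BANKS * SHARED_BANK_BYTES
chip_total = local_total + shared_total

print(f"per core: {instr_bytes} B code + {data_bytes} B data = {per_core_bytes} B")
print(f"all cores: {local_total} B, shared banks: {shared_total} B")
print(f"whole chip: {chip_total} B (~{chip_total / 2**20:.2f} MiB)")
```

Under those assumptions each core gets a bit over 1 KB in total, and the whole chip holds under 2 MiB, which is the scale the rest of this comment is complaining about.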
History has not been kind to this kind of grid architecture with lots of CPUs and very little memory. Almost none of them ever made it out of the lab. It's symptomatic of hardware engineers who are clueless about software and design unprogrammable computers. They confuse aggregate theoretical throughput with useful compute resources.
Debugging code on this would be a nightmare. It's completely asynchronous, there is no hardware to segregate different sets of CPUs doing different computing tasks, and there are so few resources per CPU that software debugging aids would crowd out the working code. The people listed on the paper should be punished by being forced to make it do useful work for at least a year. They would be scarred for life.
Why is Snark Required?
Like this one from 2004?
https://tech.slashdot.org/story/11/01/03/1722240/researchers-claim-1000-core-chip-created
Even ignoring all other limitations of this particular processor there's still Amdahl's law, limiting the speedup by the serial parts of a task.
As one example of how that works, look at compiling to hardware. In theory this should bring enormous benefits, as not only can one parallelize on an instruction level but on a sub-instruction one, speculating and pipelining e.g. additions. Many types of communication can be eliminated entirely by replicating hardware.
But even with those benefits there is a _lot_ of software that is better run on a standard processor. Why? Because using custom optimized hardware to run it ends up replicating a number of normal processors including caches, branch prediction etc., and then a processor optimized by a dedicated team of experienced people ends up being attractive.
Not saying custom hardware can't bring huge benefits, not even saying that this research processor can't do it; _however_, in general there are a lot of tasks that can't really be accelerated much.
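For anyone who wants the Amdahl's law limit from this comment as numbers, a minimal sketch follows. The function name `amdahl_speedup` and the example fractions are illustrative choices, and the formula is the textbook form of the law (it ignores communication and memory overheads, which would only make things worse).

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Textbook Amdahl's law: speedup = 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)

# Example fractions are arbitrary; they are not measurements of any real program.
for p in (0.50, 0.95, 0.99):
    print(f"p={p:.2f}: 4 cores -> {amdahl_speedup(p, 4):.2f}x, "
          f"1000 cores -> {amdahl_speedup(p, 1000):.2f}x, "
          f"ceiling -> {1.0 / (1.0 - p):.0f}x")
```

Even with 95% of the work parallelizable, 1,000 cores give a bit under 20x, because the serial 5% sets a hard ceiling of 1 / 0.05 = 20x.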
So has it been tested on bitcoin mining yet? Or the SETI project? I'm curious what sort of real-world throughput it has...
That is because serious computer engineers do not use Ubuntu
https://tech.slashdot.org/stor... ...
Does anyone know if there was any further development on that project?
If that project had continued to develop over the 12 long years between 2004 and now, the result could have been awesome, particularly coupled with the upcoming 10nm node that Samsung / TSMC / GlobalFoundries are busy developing.
It's only 0.7W when clocked at 115MHz, but still impressive.
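The 115MHz figure can be sanity-checked against the summary's numbers with a quick sketch. The one assumption, not stated in the summary, is that each core retires roughly one instruction per clock cycle; the constant names are made up.

```python
# Numbers from the summary; the one-instruction-per-cycle reading is an assumption.
CORES = 1000
LOW_POWER_RATE = 115e9   # instructions per second at 0.7 W
LOW_POWER_WATTS = 0.7
PEAK_RATE = 1.78e12      # maximum instructions per second

# Per-core instruction rate, which equals the clock rate under the assumption above.
per_core_low_hz = LOW_POWER_RATE / CORES
per_core_peak_hz = PEAK_RATE / CORES

# Energy per instruction at the low-power operating point, in picojoules.
energy_per_instr_pj = LOW_POWER_WATTS / LOW_POWER_RATE * 1e12

print(f"low-power point: {per_core_low_hz / 1e6:.0f} MHz per core")
print(f"peak point: {per_core_peak_hz / 1e9:.2f} GHz per core")
print(f"energy at 0.7 W: {energy_per_instr_pj:.2f} pJ per instruction")
```

So the 0.7W figure and the 1.78 trillion figure describe two different operating points, roughly a factor of 15 apart in clock rate, and the summary does not give the power draw at the peak rate.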
1000 interconnected processors? Wow.
Scale that up by a factor 100 million, and you have a human brain.
Something that will run Flash without bogging down.
Do not look at laser with remaining good eye.
Imagine a Beowulf cluster of these!
What kind of computer scientists are they?
They should have made it 1024. And labelled them 0-1023.
Systemd? Probably because serious computer engineers don't have any trouble dealing with the irritation that systemd causes.
Confirming: our latest nodes on our cluster are running CentOS7 which is systemd powered.
(And hopefully the final practical product to come out of this buzzword-compliant press release will still be somewhat useful.
We could have some special workloads to apply it to.)
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
Actually, not as many things as you might think. Amdahl's law says that the achievable speedup is limited by the fraction of the algorithm that can be parallelized. For some operations, the task parallelizes well (hence the interest in GPU computing for some tasks). For other tasks every step requires the result of a previous step, and the effect of adding more CPUs is minimal. Web browsers fall in this last category: you can't do layout until you have some data off the network, and even then the layout mostly proceeds top to bottom of the page. You can't really finish the layout until you know how big all your images, etc. are. So yes, sure, you can benefit by adding a few processors, but very quickly you hit a point of diminishing returns.
-JS
It depends.
In the case of Xeon-Phi (i.e.: ex-Larrabee GPUs repurposed as parallel processing units), in addition to the very wide SIMD AVX512 units, there are also scalar cores able to run Pentium-compatible binaries.
So the Linux core managing all the hardware actually runs *on* the GPU itself (and you can SSH into your Xeon-Phi if you want).
On the other hand, the Tilera works exactly as you describe.
A weird many-core structure running the processing kernels,
and a nearby classical risc core managing the whole.
Had to say it. Haven't seen that response in a while.
Peace is easy to achieve, just surrender. Liberty is much harder to get/keep.
I'm nitpicking to hell with this but...
Yes, all the *SIMD units attached to 1 execution core* will necessarily process the exact same instruction at the same time on the same cycle... ...but there is more than 1 execution core on most higher range GPUs, and nearly all modern GPUs are able to keep several hyperthreads running concurrently to hide latencies.
(which from a design point of view makes complete sense: graphical processing is about repeating some processing on thousands or millions of pixels. Better to group them in batches instead of processing every last damn pixel individually)
So a modern GPU can execute several different instructions at the same time.
Even if usually it's the same exact OpenCL code uploaded to all units, the various SIMD units could be executing different points of the code.
But yeah, you're right, within a SIMD unit, all the threads run the same instruction.
So this is basically like GreenArrays, only with less powerful CPUs, clocked, and unavailable for purchase.
http://www.greenarraychips.com...
Nit-picking to hell...
You've forgotten a special use case:
Yes, if AC's code does something stupid like "every even thread branches left, every odd thread branches right", the execution group will need to run the code twice, with alternating masks to run each branch, exactly as you describe.
But if it's entirely different parts of the thread block that diverge (e.g.: first half vs. second half), the "execution groups" will each diverge independently. The first 18 taking one branch and the second taking the other branch. With no time lost due to alternating execution masks.
(Which is the preferable way to handle branching code in parallel environment. If you can't do away with the branches altogether, at least try to organise it so nearby threads on the same SIMD branch/loop together.
e.g.: bin-sort your loops by similar lengths together)
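To make the two cases concrete, here is a toy model, a sketch and not a description of any real GPU: threads are chopped into fixed-size execution groups, and a group needs one masked pass per distinct branch direction taken inside it. The group size of 32 and the name `lockstep_passes` are assumptions chosen for illustration.

```python
def lockstep_passes(branch_taken, group_size):
    """Count serialized passes: one per distinct branch direction in each group."""
    passes = 0
    for start in range(0, len(branch_taken), group_size):
        group = branch_taken[start:start + group_size]
        passes += len(set(group))
    return passes

THREADS = 64
GROUP = 32  # assumed group width, purely for illustration

# Even threads go one way, odd threads the other: every group diverges.
interleaved = [t % 2 == 0 for t in range(THREADS)]
# First half goes one way, second half the other: no group diverges internally.
halves = [t < THREADS // 2 for t in range(THREADS)]

print(lockstep_passes(interleaved, GROUP))  # 4: both groups run both paths
print(lockstep_passes(halves, GROUP))       # 2: each group runs a single path
```

The same amount of work diverges in both layouts, but grouping threads that branch together halves the number of passes in this model, which is the point about organising nearby threads to branch/loop together.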
Why is this modded down? Editordavid is a fucking idiot and the fact that he is still working for slashdot says something about the management.
He has to do one thing, and that's post articles other people have already written. That means all he has to do is come up with introductory text and copy and paste the rest. I mean for fuck sakes.
TLDR: whipslash keeps him around because he sucks a mean dick.
well, fsck me!.
Well, fsck is also going to be handled by systemd! Systemd is cancer!!!
No, wait, you're running the whole thing on top of BTRFS, which doesn't have a real fsck because it doesn't make any sense on copy-on-write systems! BTRFS is the cheap knock-off of ZFS!!!!
Argh! All these memes are starting to get confusing, I don't know which one I currently need to blame!
Sorry, can't "+1 Funny" you, cause I've already posted in this thread...