California Researchers Build The World's First 1,000-Processor Chip (ucdavis.edu)
An anonymous reader quotes a report from the University of California, Davis about the world's first microchip with 1,000 independent programmable processors: The 1,000 processors can execute 115 billion instructions per second while dissipating only 0.7 Watts, low enough to be powered by a single AA battery...more than 100 times more efficiently than a modern laptop processor... The energy-efficient "KiloCore" chip has a maximum computation rate of 1.78 trillion instructions per second and contains 621 million transistors.
Programs get split across many processors (each running independently as needed with an average maximum clock frequency of 1.78 gigahertz), "and they transfer data directly to each other rather than using a pooled memory area that can become a bottleneck for data." Imagine how many mind-boggling things will become possible if this much processing power ultimately finds its way into new consumer technologies.
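As a rough sanity check on the efficiency figures quoted in the summary, here is a back-of-the-envelope sketch in Python. It only divides the two quoted numbers (115 billion instructions per second at 0.7 W), so the result is an average at that one operating point, not a measured per-instruction cost. The function and constant names are made up for illustration.

```python
# Figures quoted in the summary above; everything else here is illustrative.
INSTRUCTIONS_PER_SECOND = 115e9  # 115 billion instructions per second
POWER_WATTS = 0.7                # power dissipated at that rate


def energy_per_instruction_joules(instructions_per_second, power_watts):
    """Average energy per instruction (a watt is one joule per second)."""
    return power_watts / instructions_per_second


def instructions_per_joule(instructions_per_second, power_watts):
    """Average number of instructions executed per joule of energy."""
    return instructions_per_second / power_watts


if __name__ == "__main__":
    e = energy_per_instruction_joules(INSTRUCTIONS_PER_SECOND, POWER_WATTS)
    n = instructions_per_joule(INSTRUCTIONS_PER_SECOND, POWER_WATTS)
    print(f"{e * 1e12:.2f} pJ per instruction")
    print(f"{n / 1e9:.1f} billion instructions per joule")
```

That works out to roughly 6 picojoules per instruction on average at the quoted operating point.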
But I am not sure what system or software can take advantage of it. Personally, I want to see progress being made on quantum computing for consumer-level stuff.
That's probably all it can run. Typically, specially designed systems need the ability to configure the OS radically differently than has been done previously, which requires source code. Microsoft provides source code, as does IBM, in some special situations, but mostly it tends to be Linux that is used first. Consider the reasoning behind the OS chosen for the fastest computers in the world.
Systemd? Probably because serious computer engineers don't have any trouble dealing with the irritation that systemd causes. (The rest of us may, but if you have enough smarts to handle building a specialized chip, then systemd isn't really a challenge.)
B) Eliminate all the stupid users. This is frowned upon by society.
I take it you've never done high performance computing, have you? More cores is often a good thing. If I'm doing a simulation across 1,024 cores and each node has 16 cores, that means I need a minimum of 64 nodes. There's a lot of communication that takes place over interconnects like InfiniBand in order to make MPI work. It also rules out shared-memory approaches like OpenMP once jobs reach that scale and have to be spread across multiple nodes. If more cores are located within a single node, it reduces the amount of communication with other nodes and the resulting latency. It also makes shared memory a viable option for larger parallel jobs. If I can fit 64 or 256 cores on a node, there's a lot less need for relatively slow interconnects like InfiniBand to pass messages. I don't think the ordinary user has a need for 1,000 cores, or will have such a need for a long time. But it really could help with high performance computing.
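The node arithmetic in the comment above can be sketched in a few lines of Python. The helper name `nodes_needed` is made up for illustration, and the sketch only counts nodes; it says nothing about how much inter-node traffic a real MPI job would generate.

```python
def nodes_needed(total_cores, cores_per_node):
    """Minimum number of nodes that together provide `total_cores` cores."""
    if total_cores <= 0 or cores_per_node <= 0:
        raise ValueError("core counts must be positive")
    # Integer ceiling division: a partly used node still counts as a node.
    return -(-total_cores // cores_per_node)


if __name__ == "__main__":
    # Core counts per node mentioned in the comment: 16, 64 and 256.
    for cores_per_node in (16, 64, 256):
        n = nodes_needed(1024, cores_per_node)
        print(f"{cores_per_node:>3} cores/node -> {n} nodes for 1,024 cores")
```

Going from 16 to 256 cores per node cuts the same 1,024-core job from 64 nodes to 4, which is the reduction in node-to-node communication the comment is getting at.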
It could be an interesting extra chip in a general-use computer, one that programs could siphon routines off to: for example, some kinds of video/image rendering, parallelizable mathematical operations, image recognition, a 1,000-node neural network, etc.
Doing any sort of large-scale computational fluid dynamics or finite element simulations may require a great many cores. For example, you might want to conduct a very detailed simulation of the air flow around a vehicle, airplane, structure, etc. to have a basic understanding of its aerodynamics before spending time and money testing an actual prototype in a wind tunnel. You might also want to look at how very complicated, soft-body structures deform due to a variety of external stimuli. Such information would be crucial for certain materials science applications. Chemical reaction and acoustic simulations may also require a great deal of computing power, especially if you want to have a high spatio-temporal resolution.
Essentially, there are plenty of physical and theoretical science applications that can benefit from massive processing capabilities. There is a lot of fundamental science that is also performed in simulation before any actual tests occur.
A single AA is marketing lies. Sure, the battery will handle it, but it's not what it is made for, and the energy you actually get out will be less than the rated capacity.
The number listed on the battery is typically how much you will get out of it over a 20-hour discharge.
You should not be surprised if an AA battery only lasts for half the rated time if you try to suck 0.7W out of it.
The runtime won't be much above 1 hour.
OTOH a computer with 1000 processors is hardly made for portable applications, so the single AA example is just silly anyway. For the applications one would use this for, there are better power sources available.
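For anyone who wants to play with the battery numbers in the comment above, here is a minimal sketch. The 2,500 mAh capacity and 1.5 V nominal voltage are assumed figures for a generic AA cell, not from the article, and the `usable_fraction` derating is a stand-in for the "half the rated time" remark rather than a measured discharge curve.

```python
def ideal_runtime_hours(capacity_mah, nominal_volts, load_watts):
    """Runtime if the full rated capacity were available at the given load."""
    energy_wh = capacity_mah / 1000.0 * nominal_volts  # rated energy in Wh
    return energy_wh / load_watts


def derated_runtime_hours(capacity_mah, nominal_volts, load_watts,
                          usable_fraction):
    """Runtime when only `usable_fraction` of the rated energy is delivered."""
    if not 0.0 < usable_fraction <= 1.0:
        raise ValueError("usable_fraction must be in (0, 1]")
    ideal = ideal_runtime_hours(capacity_mah, nominal_volts, load_watts)
    return ideal * usable_fraction


if __name__ == "__main__":
    # ASSUMED cell figures (hypothetical, for illustration only).
    capacity_mah, volts = 2500, 1.5
    load_watts = 0.7  # the chip's quoted power draw
    print(ideal_runtime_hours(capacity_mah, volts, load_watts))
    # 0.5 mirrors the "half the rated time" remark in the comment.
    print(derated_runtime_hours(capacity_mah, volts, load_watts, 0.5))
```

The answer swings heavily with the derating you pick, so treat the output as an order-of-magnitude estimate; a harsher fraction pushes it down toward the roughly one hour the comment suggests.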
Even ignoring all the other limitations of this particular processor, there's still Amdahl's law, which limits the speedup according to the serial parts of a task.
As one example of how that works, look at compiling to hardware. In theory this should bring enormous benefits, as one can parallelize not only on an instruction level but on a sub-instruction one, speculating and pipelining e.g. additions. Many types of communication can be eliminated entirely by replicating hardware.
But even with those benefits, there is a _lot_ of software that is better run on a standard processor. Why? Because the custom optimized hardware needed to run it ends up replicating much of a normal processor, including caches, branch prediction etc., and then a processor optimized by a dedicated team of experienced people ends up being attractive.
Not saying custom hardware can't bring huge benefits, not even saying that this research processor can't do it. _However_, in general there are a lot of tasks that can't really be accelerated much.
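To put numbers on the Amdahl's law point above, here is a small Python sketch of the standard formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can be parallelized and n is the number of processors. The 95% parallel fraction in the example is an assumed figure for illustration, not something measured on KiloCore.

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Ideal speedup under Amdahl's law, ignoring communication overhead."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


if __name__ == "__main__":
    p = 0.95  # ASSUMED parallel fraction, for illustration only
    for n in (1, 10, 100, 1000):
        print(f"{n:>4} processors -> speedup {amdahl_speedup(p, n):.2f}x")
    # With unlimited processors the speedup approaches 1 / (1 - p).
    print(f"upper bound -> {1.0 / (1.0 - p):.2f}x")
```

With 5% of the task stuck running serially, the speedup can never pass 20x however many processors are added, so 1,000 cores land just under that ceiling.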
I still wonder how long it will be until the 'traditional host CPU' is scaled down to a small SoC, so that the traditional heavyweight CPU is freed up for tasks that actually require it: most of what runs on the i5 in the machine I am writing this on doesn't need anything remotely as powerful as said i5. Likewise, putting a small SoC-like chip in the graphics card and running most of the GUI there is another option. As such, once processors hit the single-core brick wall (and they're kind of doing that now), performance improvements will come from offloading whatever can run on a small power-efficient core to such a core. Given what the chip in e.g. a Pi Zero costs, it ought to make sense: connect your machine to power, and a tiny microcontroller handles the ILO and basic system management functions, and on power-on, a larger microcontroller/SoC does what the BIOS/UEFI does on current machines. Similarly, in the screen we have the same arrangement, with a microcontroller starting up the GPU and display (independently of the rest of the machine). A modern PC is already like a small network (the GPU being networked to the main CPU via the PCIe bus, multiple Intel sockets networked via QPI etc.). Making this more explicit is the sensible thing to do.
John_Chalisque