Google's Custom Machine Learning Chips Are 15-30x Faster Than GPUs and CPUs (pcworld.com)

Posted by msmash on Wednesday April 5, 2017 @07:20AM from the pushing-the-boundaries dept.

Four years ago, Google was faced with a conundrum: if all its users hit its voice recognition services for three minutes a day, the company would need to double the number of data centers just to handle all of the requests to the machine learning system powering those services, reads a PCWorld article, which describes how the Tensor Processing Unit (TPU), a chip designed to accelerate the inference stage of deep neural networks, came into being. The article shares an update: Google published a paper on Wednesday laying out the performance gains the company saw over comparable CPUs and GPUs, both in terms of raw power and the performance per watt of power consumed. A TPU was on average 15 to 30 times faster at the machine learning inference tasks tested than a comparable server-class Intel Haswell CPU or Nvidia K80 GPU. Importantly, the performance per watt of the TPU was 25 to 80 times better than what Google found with the CPU and GPU.

I for one, by Bodhammer · 2017-04-05 07:23 · Score: 3, Funny

Welcome our new Google overlords. (or whatever...)

--
"I say we take off, nuke the site from orbit. It's the only way to be sure."
1. Re:I for one, by Hognoxious · 2017-04-05 08:14 · Score: 3, Funny
  
  No point saying it, dude. They already know.
  
  --
  Confucius say, "Find worm in apple - bad. Find half a worm - worse."
2. Re:I for one, by R3d M3rcury · 2017-04-05 09:20 · Score: 1
  
  I wonder where they got the idea for a custom machine learning chip...
  Oh. Shit.
A purpose built chip by Anonymous Coward · 2017-04-05 07:28 · Score: 5, Insightful

outperforms general purpose chips?
Wow.
1. Re: A purpose built chip by Anonymous Coward · 2017-04-05 07:44 · Score: 1
  
  My thoughts exactly. How is it even the least bit surprising that custom silicon is better than general purpose?
2. Re:A purpose built chip by MightyYar · 2017-04-05 07:58 · Score: 1
  
  While there is some truth to that, this "purpose built" chip's purpose is to run an open-source AI language. So this is more interesting than a typical custom ASIC.
  
  --
  W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
3. Re:A purpose built chip by NatasRevol · 2017-04-05 08:02 · Score: 1
  
  ASICs everywhere are embarrassed by how slow the TPUs are.
  
  --
  There are two types of people in the world: Those who crave closure
4. Re:A purpose built chip by NatasRevol · 2017-04-05 08:04 · Score: 1
  
  How so?
  Custom chip built for how the bits are handled is .... typical of an ASIC.
  
  --
  There are two types of people in the world: Those who crave closure
5. Re: A purpose built chip by Anonymous Coward · 2017-04-05 08:10 · Score: 1
  
  I suppose the application is *slightly* more broad than a typical ASIC but yea, looks like some marketing article.
6. Re:A purpose built chip by ShanghaiBill · 2017-04-05 08:58 · Score: 4, Informative
  
  How so?
  The TPU is a "purpose built" chip, but that purpose is very broad. It is optimized for massively parallel low-precision matrix operations, which is useful not only for neural nets, but also simulation of physical processes like CFD, weather prediction, climate models, computational chemistry, etc. It can do everything a GPU can do except the rasterization and texture mapping, but it can do it faster and with much less power.
7. Re: A purpose built chip by ChristopherSkinner · 2017-04-05 09:42 · Score: 1
  
  GPU chips are a bunch of multipliers. Floating point, which is, of course, integer with an exponent. The press release talks about "inference" as if this is a math function. My guess is weighted multipliers. ... If Google has in fact made more efficient multipliers, then you gamers can turn off those noisy fans. Or are they faster? It seemed to be saying they were more efficient.
8. Re:A purpose built chip by Baloroth · 2017-04-05 10:12 · Score: 4, Informative
  
  It is optimized for massively parallel low-precision matrix operations, which is useful not only for neural nets, but also simulation of physical processes like CFD, weather prediction, climate models, computational chemistry, etc.
  Maybe, but I doubt it. It's far too low precision, for one thing: 8-bits doesn't get you very far in any of those fields (you typically want at least 32-bit FLOPS for those, and quite often 64-bit precision is required, as numerical errors accumulate exponentially in a chaotic system), and they're really not even big matrices (just 256x256). Really the only place this kind of thing would excel is signal processing, which is basically what they're using them for.
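
To make the precision point concrete, here is a minimal sketch of an 8-bit dot product over 256 elements, compared against the floating-point result. The symmetric scaling scheme, the helper names (`quantize`, `dot_int8`, `quantized_dot`), and the random test data are all assumptions chosen for illustration; nothing here is taken from the TPU paper or describes how the chip actually quantizes.

```python
import random

def quantize(values, scale):
    """Map floats to signed 8-bit integers using a shared scale factor."""
    return [max(-128, min(127, round(v / scale))) for v in values]

def dot_int8(a_q, b_q):
    """Multiply 8-bit operands and accumulate in a wide integer."""
    return sum(x * y for x, y in zip(a_q, b_q))

def quantized_dot(a, b):
    """Approximate the float dot product of a and b using 8-bit operands."""
    # Hypothetical scheme: scale each vector so its largest magnitude maps to 127.
    scale_a = max(abs(v) for v in a) / 127 or 1.0
    scale_b = max(abs(v) for v in b) / 127 or 1.0
    acc = dot_int8(quantize(a, scale_a), quantize(b, scale_b))
    return acc * scale_a * scale_b

random.seed(0)
n = 256  # same length as one row of the 256x256 matrices mentioned above
a = [random.uniform(-1, 1) for _ in range(n)]
b = [random.uniform(-1, 1) for _ in range(n)]

exact = sum(x * y for x, y in zip(a, b))
approx = quantized_dot(a, b)
print(f"float: {exact:.4f}  int8: {approx:.4f}  abs error: {abs(exact - approx):.4f}")
```

A single dot product like this loses only a little to rounding, which is tolerable for neural net inference. A simulation that feeds each result back into the next time step has no such luxury, which is the objection being raised here.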
  
  --
  "None can love freedom heartily, but good men; the rest love not freedom, but license." --John Milton
9. Re:A purpose built chip by Tough Love · 2017-04-05 10:58 · Score: 1
  
  It takes considerable organizational effort to push an ASIC all the way through the pipe from design to production. Even budgeting and staffing are nontrivial. The technology might not be earth shattering, but the engineering process is respectable. And who knows, the technology might be earth shattering. But probably not. It uses numerical methods; analog would be faster and more interesting.
  
  --
  When all you have is a hammer, every problem starts to look like a thumb.
10. Re:A purpose built chip by psmoot · 2017-04-05 12:10 · Score: 1
  
  I can't count how many times someone thought they could build an ASIC to optimize some computation, only to get crushed by Moore's Law operating on general purpose CPUs.
  Never fight a land war during a Russian winter. Never bet against Ethernet. Never bet against Moore's Law.
11. Re:A purpose built chip by MightyYar · 2017-04-05 12:12 · Score: 1
  
  By your broad definition, a GPU is also a "typical ASIC".
  
  --
  W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
12. Re: A purpose built chip by ArmoredDragon · 2017-04-05 13:08 · Score: 1
  
  Not sure why they didn't just call it what it is: ASIC.
13. Re: A purpose built chip by Tough Love · 2017-04-05 15:23 · Score: 1, Insightful
  
  Actually, what is really surprising is that Google considered the project worth doing to get only 15-30% advantage vs GPU, if those numbers are accurate. In the best case, this buys roughly an 18 month advantage before GPUs get faster and the engineering has to be done all over again, or the project will just go the way of other Google abandonware. And in that brief window, do saved operating costs justify the sunk engineering and fabrication cost? I doubt it.
  Now, on second look, this smells like a vanity project more than anything.
  
  --
  When all you have is a hammer, every problem starts to look like a thumb.
14. Re: A purpose built chip by religionofpeas · 2017-04-05 18:31 · Score: 1
  
  They are 15-30 times faster, not 15-30%. That's a huge difference. And this is only the first version, so it is likely that the TPU can be improved faster than GPUs that have been on the market for years.
15. Re:A purpose built chip by religionofpeas · 2017-04-05 18:34 · Score: 1
  
  It will take a while for Moore's law to catch up with 15-30 times speed improvement, and even better power improvement.
  And Moore's law also helps this chip.
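
For a rough sense of "a while", here is a back-of-the-envelope sketch. It assumes general-purpose performance doubles every 18 months, a commonly quoted figure that is an assumption here and not something from the article, and it ignores the fact that the TPU would keep improving too. The function name `years_to_catch_up` is made up for this example.

```python
import math

def years_to_catch_up(speedup, months_per_doubling=18):
    """Years of steady doubling needed to close a given speedup gap."""
    doublings = math.log2(speedup)  # how many times performance must double
    return doublings * months_per_doubling / 12

for speedup in (15, 30):
    print(f"{speedup}x gap: about {years_to_catch_up(speedup):.1f} years")
```

Under those assumptions the 15-30x gap works out to roughly six to seven and a half years of doubling, and the 25-80x performance-per-watt gap would take longer still.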
16. Re:A purpose built chip by religionofpeas · 2017-04-05 19:24 · Score: 1
  
  It's not that obvious when you're talking about floating point calculations in combination with external memory. A GPU is highly optimized for both of those requirements, and it's not all that simple to make an ASIC that does this better. The main reason Google got such an improvement is because they require much less precision in their results.
17. Re: A purpose built chip by Tough Love · 2017-04-06 00:54 · Score: 3, Informative
  
  They are 15-30 times faster, not 15-30%.
  Every little order of magnitude really helps :)
  
  --
  When all you have is a hammer, every problem starts to look like a thumb.
18. Re: A purpose built chip by s_p_oneil · 2017-04-06 00:55 · Score: 2
  
  I'm sure you've already seen the "it's 15-30 times, not 15-30 percent" replies. There's also the "performance per watt of the TPU was 25 to 80 times better". Can you imagine how much money this can save Google in electricity costs? It's 1-2 orders of magnitude better (10-100 times), with the possibility that they will continue to find dramatic improvements.
  If we equate your assessment with a "bunt", what Google really did is knock the ball out of the park.
19. Re:A purpose built chip by David_Hart · 2017-04-06 02:43 · Score: 1
  
  By your broad definition, a GPU is also a "typical ASIC".
  Yep... A GPU is basically an ASIC as well. It's programmed to do one thing well. That it can be used for other things that use similar calculations was pure coincidence (i.e. bitcoin mining).
20. Re:A purpose built chip by NatasRevol · 2017-04-06 03:57 · Score: 1
  
  Well, since it's not a CPU, yes it's an ASIC.
  
  --
  There are two types of people in the world: Those who crave closure
21. Re: A purpose built chip by Jayfar · 2017-04-06 03:59 · Score: 1
  
  Not sure why they didn't just call it what it is: ASIC.
  Well TFA kinda did that: "TPUs are what’s known in chip lingo as an application-specific integrated circuit (ASIC)."
22. Re:A purpose built chip by psmoot · 2017-04-06 06:08 · Score: 1
  
  I know, that's what makes this a remarkable achievement. Many times in the past people have tried to do this but it took much longer than they anticipated. In the meantime, Intel, AMD, nVidia, ATI, or whoever managed to catch up and surpass the ASIC. It turned out the performance win from ASICs had a shorter shelf life than people realized.
  It seems times have changed. Google has a very specific workload which appears to be different from what the mainstream processors have optimized for. The easy (easier?) part of Moore's Law (crank cycle times and add ALUs/FPUs) seems over. As a result, Google has a great win, yay for them. If this turns out to be a large market, I'll be surprised if neural net circuits don't show up in mainstream CPUs and GPUs in the next few generations. Then it becomes a race to see who can optimize the fab process fastest.
23. Re:A purpose built chip by MightyYar · 2017-04-06 08:36 · Score: 1
  
  Agreed that a GPU is an ASIC. I don't think they are "typical" in a few senses, but mainly the crowd around here go crazy over them.
  
  --
  W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
24. Re:A purpose built chip by MightyYar · 2017-04-06 08:37 · Score: 1
  
  Not sure where I said otherwise.
  
  --
  W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
Wait you mean an ASIC is fast? Why I never! by Sycraft-fu · 2017-04-05 07:39 · Score: 5, Informative

Man is this a "duh" moment. Purpose built ASICs are extremely fast and low power for what they accomplish. That's why we use them. Look at a small desktop network switch: Little tiny processor that can pass 16 Gb/sec of traffic around. Try and put 8 NICs in a computer and have it switch traffic and you'll be amazed at how much power you need. The reason the switch is small is it is purpose built: Its ASIC does nothing but switch Ethernet packets.
Same deal with some things on a CPU. You find that decoding an AVC video stream takes next to no CPU power on modern CPUs, yet decoding an MPEG-2 video takes some. Why? Because they have a small bit of dedicated logic for AVC decoding (usually some other formats too). It is low power because it is dedicated.
Always the question in designing a system is flexibility and unit cost vs fixed function and up front cost. A CPU is great because it can do anything, and you can just buy them straight out, tons of companies have them available for purchase right now. However they take a lot of silicon and power to perform a given task. An ASIC takes a bunch of up front money to design and do a manufacturing run, but is very small and efficient, however it can't be reconfigured to do anything else and needs a full respin. In the middle there is something like an FPGA. Which one is right for an application just depends on the balance of a lot of factors.
1. Re:Wait you mean an ASIC is fast? Why I never! by chispito · 2017-04-05 07:54 · Score: 3, Insightful
  
  An ASIC takes a bunch of up front money to design and do a manufacturing run, but is very small and efficient, however it can't be reconfigured to do anything else and needs a full respin.
  Per TFA, the chips they designed are flexible enough to apply to new machine learning models. I think the point is that this was a space ripe for customized architecture, like graphics cards were 15-20 years ago.
  
  --
  The Daddy casts sleep on the Baby. The Baby resists!
2. Re:Wait you mean an ASIC is fast? Why I never! by Nukenbar · 2017-04-05 08:29 · Score: 2
  
  There is a reason that all of the Bitcoin miners are ASIC based now.
  Don't expect those machines to be able to do anything else though if bitcoin dies off.
3. Re:Wait you mean an ASIC is fast? Why I never! by thegarbz · 2017-04-05 08:54 · Score: 1
  
  Purpose built ASICs are extremely fast and low power for what they accomplish
  And they have very specific algorithms. Certainly nothing traditionally resembling "machine learning".
Say what... by __aaclcg7560 · 2017-04-05 07:42 · Score: 1

I thought the TPU was for hard drive encryption. Or is it doing double duty?
1. Re:Say what... by itsownreward · 2017-04-05 07:50 · Score: 4, Informative
  
  You're thinking of a TPM. This is a TPU.
15-30x the speed by fred6666 · 2017-04-05 08:05 · Score: 2

But 1000x as expensive?
1. Re:15-30x the speed by religionofpeas · 2017-04-05 08:39 · Score: 1
  
  Energy cost is lower, and those will be dominant over longer term.
Prime directive! Bah! by kimgkimg · 2017-04-05 09:15 · Score: 1

Oh good, so our dystopian future can be realized just that much faster then...
amazing! by tommeke100 · 2017-04-05 10:09 · Score: 1

but how many fps does it get running the new Mass Effect? Oh it can't?
No, they don't by fyngyrz · 2017-04-05 10:35 · Score: 1

Algorithms and processor sets are not artificial intelligence and neural networks
That's like saying a software defined radio is not a radio.
It's right -- but it's also completely wrong.
And the important part in the context here... yeah, the completely wrong part.
You can create a perfectly fine neural network with a general purpose von Neumann or Harvard architecture CPU. Speed and efficiency are issues, that's all, and that's what the TPU is designed to address.

--
I've fallen off your lawn, and I can't get up.
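
As a toy illustration of that last point, here is a minimal sketch of a neural network running inference on an ordinary CPU in plain Python. The weights are picked by hand so that the network computes XOR; the function names (`step`, `neuron`, `xor_net`) are invented for this example, and nothing here reflects how the TPU or TensorFlow represents a network.

```python
def step(x):
    """Threshold activation: fire when the weighted sum is positive."""
    return 1 if x > 0 else 0

def neuron(inputs, weights, bias):
    """One neuron: a weighted sum of the inputs followed by the activation."""
    return step(sum(i * w for i, w in zip(inputs, weights)) + bias)

def xor_net(a, b):
    """Two hidden neurons (OR, NAND) feeding one output neuron (AND)."""
    h_or = neuron((a, b), (1, 1), -0.5)
    h_nand = neuron((a, b), (-1, -1), 1.5)
    return neuron((h_or, h_nand), (1, 1), -1.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_net(a, b))
```

Inference here is nothing more than multiply, add, and threshold, repeated per neuron. Scaling that up to millions of weights is where speed and efficiency become the issue.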
Inherent stratification by fyngyrz · 2017-04-05 10:39 · Score: 1

Energy cost is lower, and those will be dominant over longer term.
This is likely another demonstration of "those who have the money, make more money."
Solar panels: You can save all kinds of money. If you can afford to install the system in the first place.
Investments: You can make all kinds of interest. If you have money to invest.
Toilet paper: You can save lots of money. If you buy it in bunches on sale. But if you can't spare the funds... your TP costs more than the person with a few bucks to spare who buys it in bulk. Likewise has storage space for it, etc.
And so on.

--
I've fallen off your lawn, and I can't get up.
Re: Performance bottleneck by aussie_a · 2017-04-05 11:23 · Score: 1

It's not about training the neural network. It's about data mining and monetizing the customers. That's why everything Google phones home.
Machine language example? by Tablizer · 2017-04-05 12:06 · Score: 1

What does the machine language for these things look like? Does anybody know of a bare-bones example to illustrate how it does a simple sample neural net? Is it only for the offset shifting kind of NN's common for language AI, or other kinds also?

--
Table-ized A.I.
There, I fixed it for you. by JustNiz · 2017-04-05 13:30 · Score: 1

>> Google's Custom Machine Learning Chips Are 15-30x Faster Than GPUs and CPUs AT MACHINE LEARNING
There, I fixed it for you.
1. Re: There, I fixed it for you. by religionofpeas · 2017-04-05 18:35 · Score: 1
  
  Thanks for fixing, but it was obvious for everyone else.
Re:Does this add up (CPU vs GPU vs TPU?) by bugs2squash · 2017-04-05 14:59 · Score: 1

maybe it's a task that is not well suited to the GPU, so it performs little better than general purpose hardware.

--
Nullius in verba
Will the chip be available to non-Googlers? by wisebabo · 2017-04-05 17:15 · Score: 1

(Disclaimer, not an AI or machine learning expert but interested in learning!)
So will this chip (or board) be available outside of Google? I've heard they've released (some of) their AI/Machine learning code, would be good if once you made a working application you could buy one of these things and speed it up. Would be especially useful for applications where access to the cloud was unavailable or intermittent at best (think self driving cars, drones, spacecraft).
I guess a PCI card that would go in a server would be best but maybe a dedicated peripheral could work.
Any other companies working on similar hardware? Are there any standards, like OpenGL for AI?
1. Re:Will the chip be available to non-Googlers? by religionofpeas · 2017-04-05 18:36 · Score: 1
  
  Check out TensorFlow.
2. Re:Will the chip be available to non-Googlers? by wisebabo · 2017-04-06 00:35 · Score: 1
  
  So they've open sourced the software? That's good, but no chip will be available?
Re:15x by trevc · 2017-04-06 04:37 · Score: 1

not much