AMD Introduces Radeon Instinct Machine Intelligence Accelerators (hothardware.com)
Reader MojoKid writes: AMD is announcing a new series of Radeon-branded products today, targeted at machine intelligence and deep learning enterprise applications, called Radeon Instinct. As its name suggests, the new Radeon Instinct line of products comprises GPU-based solutions for deep learning, inference and training. The new GPUs are also complemented by a free, open-source library and framework for GPU accelerators, dubbed MIOpen. MIOpen is architected for high-performance machine intelligence applications and is optimized for the deep learning frameworks in AMD's ROCm software suite. The first products in the lineup consist of the Radeon Instinct MI6, the MI8, and the MI25. The 150W Radeon Instinct MI6 accelerator is powered by a Polaris-based GPU, packs 16GB of memory (224GB/s peak bandwidth), and will offer up to 5.7 TFLOPS of peak FP16 performance. Next up in the stack is the Fiji-based Radeon Instinct MI8. Like the Radeon R9 Nano, the Radeon Instinct MI8 features 4GB of High-Bandwidth Memory (HBM) with peak bandwidth of 512GB/s. The MI8 will offer up to 8.2 TFLOPS of peak FP16 compute performance, with a board power that typically falls below 175W. The Radeon Instinct MI25 accelerator will leverage AMD's next-generation Vega GPU architecture and has a board power of approximately 300W. All of the Radeon Instinct accelerators are passively cooled, but when installed into a server chassis you can bet there will be plenty of air flow. Like the recently released Radeon Pro WX series of professional graphics cards for workstations, Radeon Instinct accelerators will be built by AMD. All of the Radeon Instinct cards will also support AMD Multiuser GPU (MxGPU) hardware virtualization technology.
What can I do with these new "AI cards"? Specific stuff, preferably stuff that makes me money. Not abstract, generic buzzwords, please.
How well do they run CUDA? From what I've done so far with ML/NNs it's CUDA all the way.
Almost every "How to use a GPU for ___" search comes back with CUDA instructions first, and OpenCL is nowhere close.
Looking at the tensorflow open tickets it's still very much a work in progress: https://github.com/tensorflow/...
Every time I see "16 GB of memory" on a GPU card, I have to ask the same question... Is all 16GB addressable? I've never been 'not' disappointed before.
In the words of an AMD driver developer:
"We don't happen to have the resources to pay someone else to do that for us."
https://lists.freedesktop.org/...
AMD does hardware, but they don't support it with software.
Who logs in to gdm? Not I, said the duck.
None of this will matter much until they get full, out-of-the-box OpenCL support in major deep learning libraries like TensorFlow, Theano, and Torch.
Just hire some people to do this and watch your sales shoot up.
So they're all excited about the lowest-precision, smallest-size floating point math in IEEE 754?
Not only that, but FP16 is intended for storage (of many floating-point values where higher precision need not be stored), not for performing arithmetic computations.
Kudos to AMD's marketing department for boasting about their compute performance with a number format that was never meant for computation.
Tell them to get back to me with their 64, 128, and 256-bit IEEE floating point performance.
-- Sometimes you have to turn the lights off in order to see.
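For anyone who wants to see what the FP16 precision complaint looks like in practice, here is a minimal sketch using only Python's standard library. It relies on the `struct` module's half-precision format code `"e"` to round values to IEEE 754 binary16. The helper names (`to_fp16`, `fp16_add`, `fp16_sum`) are made up for this illustration and have nothing to do with MIOpen or ROCm, and it says nothing about how any GPU actually accumulates results.

```python
import struct

# Hypothetical helpers for illustration only; not part of any AMD library.

def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 binary16 value (returned as a Python float)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def fp16_add(a: float, b: float) -> float:
    """Add two numbers with the inputs and the result rounded to binary16."""
    # The sum of two binary16 values is exact in double precision,
    # so a single final rounding gives the correctly rounded FP16 sum.
    return to_fp16(to_fp16(a) + to_fp16(b))

def fp16_sum(values) -> float:
    """Naive running sum where the accumulator is kept in binary16."""
    total = 0.0
    for v in values:
        total = fp16_add(total, v)
    return total

if __name__ == "__main__":
    # 0.1 is stored with only about three decimal digits of precision.
    print("0.1 as FP16:        ", to_fp16(0.1))
    # Above 2048 the spacing between FP16 values is 2, so adding 1 is lost.
    print("2048 + 1 in FP16:   ", fp16_add(2048.0, 1.0))
    # A running FP16 sum of 4096 ones stalls once it reaches 2048.
    print("sum of 4096 ones:   ", fp16_sum([1.0] * 4096))
    # Largest finite FP16 value is 65504; anything bigger cannot be packed.
    print("largest finite FP16:", to_fp16(65504.0))
```

The stalled running sum is the usual reason FP16 code keeps its accumulators in a wider format. Whether that trade-off is acceptable depends on the workload.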
If they open their consumer products up to full, unrestricted GPGPU access, they could have a chance by getting their tools into the hands of programmers on the cheap.
Nvidia gates the good stuff behind expensive product lines because they can get away with it. The only differences between the consumer and professional gear are preferential binning, memory amounts, and configuration fuses that enable/disable features (many of which are software only). The silicon is the same.
AMD could make inroads if they enable the small players to get the same results with generic gaming hardware. Of course, that comes without the benefits of professional-level support and mature frameworks, which is why Nvidia is ahead in the business and research space.
AMD needs to get their foot in the door because right now they're behind.
Ayy
I hear the Radeon Indistinct will be using fuzzy logic.
The only problem with MxGPU is that all the virtualization vendors see it as a value-add so it's going to be an extra cost item.
The real bet in massively parallel processing is one: cost per flop.
If this is not much lower than current CPU/GPU configurations for a typical PC/server, even if these boards are super-fast for their scale, they will fail as a product in the market.
Non-scalable (due to cost) massively parallel processing is only for research institutions, DoE and NSA that have enough money to spend on supercomputers anyway.
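To put rough numbers on the cost-per-flop argument, here is a small sketch that works only from the peak FP16 and board power figures quoted in the summary. No prices have been announced in the summary, so price is left as a parameter the reader has to supply. The `Accelerator` class and both function names are invented for this example. The MI8 board power is quoted as "typically below 175W", so its per-watt figure is a lower bound.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Accelerator:
    """Peak figures as quoted in the story summary (hypothetical container)."""
    name: str
    peak_fp16_tflops: float
    board_power_w: float

def gflops_per_watt(card: Accelerator) -> float:
    """Peak FP16 GFLOPS delivered per watt of board power."""
    return card.peak_fp16_tflops * 1000.0 / card.board_power_w

def usd_per_tflops(card: Accelerator, price_usd: float) -> float:
    """Purchase cost per peak FP16 TFLOPS; price_usd must come from the reader."""
    if price_usd < 0:
        raise ValueError("price_usd must be non-negative")
    return price_usd / card.peak_fp16_tflops

# Figures from the summary. The MI25 is omitted because no TFLOPS number is given.
MI6 = Accelerator("Radeon Instinct MI6", peak_fp16_tflops=5.7, board_power_w=150.0)
MI8 = Accelerator("Radeon Instinct MI8", peak_fp16_tflops=8.2, board_power_w=175.0)

if __name__ == "__main__":
    for card in (MI6, MI8):
        print(f"{card.name}: {gflops_per_watt(card):.1f} peak FP16 GFLOPS per watt")
```

Once real pricing is known, calling `usd_per_tflops(MI6, price_usd=...)` with that price gives the purchase half of the comparison. Peak numbers ignore utilization, so treat the results as a ceiling.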