Slashdot Mirror


MIAOW Open Source GPU Debuts At Hot Chips

alexvoica writes: The first general-purpose graphics processor (GPGPU) now available as open-source RTL was unveiled at the Hot Chips event. Although the GPGPU is in an early and relatively crude stage, it is another piece of an emerging open-source hardware platform, said Karu Sankaralingam, an associate professor of computer science at the University of Wisconsin-Madison. Sankaralingam led the team that designed the Many-core Integrated Accelerator of Wisconsin (MIAOW). A 12-person team developed the MIAOW core in 36 months. Their goal was simply to create a functional GPGPU without setting any specific area, frequency, power or performance goals. The resulting GPGPU uses just 95 instructions and 32 compute units in its current design. It only supports single-precision operations. Students are now adding a graphics pipeline to the design, a job expected to take about six months.

45 comments

  1. Hot Chips are for CATS by Anonymous Coward · · Score: 1, Insightful

    You are all Cats! Cats say Miaow! Miaow cats MIAOW! MIAOW say the cats. YOU CATS!!!

  2. General purpose by Anonymous Coward · · Score: 0

    graphics processor? You mean you can process all kinds of porn with it?

    Sounds potentially useful.

    1. Re:General purpose by dsmatthews9379 · · Score: 1

      No you will need to wait until they upgrade it to 34 cores before you can do that.

    2. Re:General purpose by tepples · · Score: 1

      graphics processor? You mean you can process all kinds of porn with it?

      Given the name, you can probably process kitty porn.

  3. GPGPU by hsa · · Score: 2

    Isn't the second G graphics? If the graphics pipeline is missing, this is just a multicore CPU.

    1. Re:GPGPU by Bengie · · Score: 1

      We should stop treating graphics as something special. We just need a compute unit that can just so happen to convert buffers into a signal a monitor can understand.

    2. Re:GPGPU by Anonymous Coward · · Score: 2, Insightful

      Ah the old RISC versus CISC argument again. Graphics is very specialized. Having a stack of relatively simple graphics cores does very helpful things that your average 4 to 8 core CPU just can't, at least not at anywhere near that speed. Maybe we will eventually have enough cores in an average CPU to somehow make it possible to do the same things. On the bright side it seems CPUs are again fast enough to decode most 1080p video streams without video card assist, so maybe we are not as far as we might otherwise be. I doubt 4k video is going to easily be decoded with today's CPUs, and complex games won't happen.

      Still, it is an interesting idea. Is there some fundamental "core" we can build processors out of that will be great at handling video as well as number crunching, while not being hopelessly complex?

    3. Re:GPGPU by Anonymous Coward · · Score: 0

      The "pipeline" that is missing just means that each core processes each instruction completely before starting on the next instruction. A pipelined architecture means it can handle multiple instructions at different phases at the same time. Pipelined is just faster.
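
      A rough sketch of the cycle counts behind the instruction-pipelining claim above, assuming an idealized pipeline with no stalls or hazards. The 5-stage depth and the function names are illustrative assumptions, not figures from MIAOW:

```python
def cycles_unpipelined(num_instructions: int, num_stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return num_instructions * num_stages


def cycles_pipelined(num_instructions: int, num_stages: int) -> int:
    """Ideal pipeline: fill it once, then retire one instruction per cycle."""
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)


if __name__ == "__main__":
    n, stages = 100, 5  # assumed workload and pipeline depth
    print(cycles_unpipelined(n, stages))  # 500
    print(cycles_pipelined(n, stages))    # 104
```

      Real designs fall short of the ideal number because of branches and data dependencies, so treat this as an upper bound on the speedup.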

    4. Re:GPGPU by Anonymous Coward · · Score: 1

      Exactly. GPGPU cores are very specialized:

      Intel i7, 2.6B transistors, 8 cores
      AMD Cayman, 2.6B transistors, 1536 cores

    5. Re:GPGPU by jonwil · · Score: 1

      The plans Intel had for Larrabee seemed like a good idea. Take an old Pentium core, add a bunch of fast special-purpose instructions specifically designed for doing the sorts of operations that 3D graphics require, stick a bunch of these cores on a single chip and add a few special blocks for certain operations (as well as stuff to actually display stuff on the screen).

      It sounded like an interesting idea (and would have been a LOT more open than anything from AMD or NVIDIA) but Intel decided to cancel the project because they didn't think they could match AMD or NVIDIA on price.

    6. Re:GPGPU by Bengie · · Score: 1

      GPU "cores" are quite different than CPU cores. I don't want them to be like CPU cores, I want GPUs to be dumb number-crunching vector-manipulating computing units.

    7. Re:GPGPU by Anonymous Coward · · Score: 0

      Specialized, they are. Cores they are not.

    8. Re:GPGPU by Guspaz · · Score: 1

      The project to produce a GPU from Larrabee was cancelled, but Larrabee itself simply morphed into Intel MIC and they've released several generations of it to market. They now use the brand name "Xeon Phi" for it.

    9. Re:GPGPU by Anonymous Coward · · Score: 0

      well. it is. the amount of internal dependency in the computation is limited to the final composition.

      so that means you can apply super-wide parallelism and provision fat, non-coherent memories and
      really go to town - use architectures which are effectively useless on more general problems.

      so, sure, i'd rather have 1k non-coherent risc cores on a die with some dedicated memory for
      solving most interesting problems with low memory footprint and throughput

      but for this a whole lotta vector pipes is worth a lot of bang for your buck

    10. Re:GPGPU by Anonymous Coward · · Score: 0

      Um, the hopelessly complex is an oxymoron in any design that must handle code that branches, i.e. the general-purpose code that makes CPUs actually useful. However, I am partial to The Mill family of designs, in which you issue up to 30 or so instructions in one cycle. With multiple cores of this, you could get a more useful CPU with GPU speed.

      Assuming all the claims bear out, since it hasn't been taped out yet.

    11. Re:GPGPU by Kjella · · Score: 1

      On the bright side it seems CPUs are again fast enough to decode most 1080p video streams without video card assist, so maybe we are not as far as we might otherwise be. I doubt 4k video is going to easily be decoded with today's CPUs, and complex games won't happen.

      Actually that's just because people do crazy things in madVR, "normal" UHD decoding can be done in software (source):

      In a JCT-VC document NTT DoCoMo showed that their HEVC software decoder could decode 3840x2160 at 60 fps using 3 decoding threads on a 2.7 GHz quad core Ivy Bridge CPU.

      --
      Live today, because you never know what tomorrow brings
    12. Re: GPGPU by Anonymous Coward · · Score: 0

      No, you are wrong. They talk about the graphics pipeline, which is a sequence of processing steps that gets you from a soup of triangles and a set of shaders to the final image. There are several different stages involved and some of them must be implemented as specialised hardware to have any semblance of actual performance. Usually, this is the input assembly, tessellation (the actual subdivision step), the rasterizer, the (hierarchical) z buffer and the final blending stage. In addition, you also want to cache textures on the chip, which only works well when you exploit certain properties of the pipeline and its parallelism. Commercial GPUs have tons of additional special hardware on top of that, but most of that is cleverly hidden away.

      Long story short: a GPU in rasterization mode looks very different from a GPU in general purpose computing mode. The chip from the story only does the latter so far.
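
      To make two of those stages concrete, here is a toy software sketch of a rasterizer with a z buffer. It is only an illustration under simplifying assumptions (pixel-space vertices, flat colors, no tessellation, blending, or hierarchical depth); the names are hypothetical and it says nothing about how any real GPU implements these stages in hardware:

```python
def edge(a, b, px, py):
    """Signed area of the parallelogram spanned by a->b and a->(px, py)."""
    return (b[0] - a[0]) * (py - a[1]) - (b[1] - a[1]) * (px - a[0])


def rasterize(triangles, width, height):
    """Rasterize flat-colored triangles with a per-pixel depth test.

    triangles: list of (v0, v1, v2, color); each vertex is (x, y, z) in
    pixel coordinates, and a smaller z means closer to the viewer.
    Returns the color buffer as a list of rows.
    """
    color = [[0] * width for _ in range(height)]
    depth = [[float("inf")] * width for _ in range(height)]
    for v0, v1, v2, c in triangles:
        area = edge(v0, v1, v2[0], v2[1])
        if area == 0:
            continue  # degenerate triangle, nothing to draw
        for y in range(height):
            for x in range(width):
                px, py = x + 0.5, y + 0.5  # sample at the pixel center
                # Barycentric weights; dividing by the signed area makes
                # the inside test independent of vertex winding order.
                w0 = edge(v1, v2, px, py) / area
                w1 = edge(v2, v0, px, py) / area
                w2 = edge(v0, v1, px, py) / area
                if w0 < 0 or w1 < 0 or w2 < 0:
                    continue  # pixel center lies outside the triangle
                z = w0 * v0[2] + w1 * v1[2] + w2 * v2[2]
                if z < depth[y][x]:  # z-buffer test: keep the nearest hit
                    depth[y][x] = z
                    color[y][x] = c
    return color


if __name__ == "__main__":
    far = ((0, 0, 0.9), (8, 0, 0.9), (0, 8, 0.9), 1)
    near = ((0, 0, 0.1), (2, 0, 0.1), (0, 2, 0.1), 2)
    for row in rasterize([far, near], 4, 4):
        print(row)
```

      The inner loops touch every pixel independently, which is the kind of parallelism the dedicated hardware exploits.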

    13. Re:GPGPU by fragMasterFlash · · Score: 1

      Larrabee became Xeon Phi and Knights Corner-based Xeon Phi cards have been on sale for quite some time. The next version will be Knights Landing and is supposed to go public sometime this year, IIRC.

    14. Re:GPGPU by Tough+Love · · Score: 1

      Failure to read even the article summary detected. The students expect to add a graphics pipeline in 6 months.

      Doing the GPGPU first was a brilliant idea, it gets the project to a state of doing something useful way sooner than going for the graphics pipeline first.

      --
      When all you have is a hammer, every problem starts to look like a thumb.
    15. Re:GPGPU by Tough+Love · · Score: 1

      Congratulations, you just reinvented the modern graphics pipeline.

      --
      When all you have is a hammer, every problem starts to look like a thumb.
    16. Re:GPGPU by Anonymous Coward · · Score: 0

      Open Source hardware designs might excite many more people if there were 3D printers that could output the result cost-effectively.

      I hope that considerable effort goes into energy management and efficiency. If the peak consumption has to be high that's okay as long as it can drop to near-nothing when demands are simple.

      Instead of buying ink cartridges, I'd just like to load sand and scrap metal, maybe even bottles and cans. With input like that, print up some vacuum tubes too.

    17. Re:GPGPU by Tough+Love · · Score: 1

      Open Source hardware designs might excite many more people if there were 3D printers that could output the result cost-effectively.

      Apparently, you do not have the slightest clue about ASICs or FPGAs.

      --
      When all you have is a hammer, every problem starts to look like a thumb.
    18. Re:GPGPU by hattig · · Score: 1

      The main purpose of dedicated hardware video decoders (and encoders) is to save power. Not because it can't be done on the CPU in adequate time. In addition, it saves money from not needing a powerful CPU to drive the process. They are a highly specialised form of hardware.

      GPUs do both - they save overall power (compared to a massive cluster of CPUs that could do the same calculations at the same rate), and also they allow the graphics to occur in real time (which a single CPU can't) at a cost similar to a single CPU. GPUs are specialised, but still quite generic so they can be used for compute tasks as well. Video decoders (and encoders) can be implemented on the GPU too, but the power savings from having a dedicated hardware unit are still worthwhile.

      HEVC is designed to put most of the work into the encoding stage, hence why it can be decoded in software on a 50W+ CPU.

  4. we must know by nimbius · · Score: 1

    as slashdotters the question must be asked...Will this GPU run Crysis...

    --
    Good people go to bed earlier.
    1. Re:we must know by UnknownSoldier · · Score: 1

      I'd rather see a 3DMark Vantage/11/13 figure so we can gauge how fast (or slow) it is. The Valley Benchmark would be good to see too.

  5. University Project by craighansen · · Score: 4, Interesting

    I attended the presentation for this chip, and as multiple audience questioners pointed out, this design hasn't been carefully designed to be clear of patents. As a university project, it's not likely to be an issue, but cribbing from a recent GPU design is not a promising way to get a patent-clear open-source hardware design. It's also not complete, as it's missing graphics-specific functions, such as texture-mapping, and the FPGA implementation had a single processing pipeline. By taking the same instruction set, they made it easier to test and operate their design using AMD's tools. All that being said, it's an impressive start for a small university group, and by enabling operation with instrumentation hooks for measuring dynamic operations, may become useful as a benchmarking and measurement tool for GPGPU programs. Just don't expect this to displace commercial designs RealSoonNow.

    1. Re:University Project by Anonymous Coward · · Score: 0

      Build it in China, sell it from China. It's kind of amazing that the patent system is broken in the whole world except in China when it comes to something like this.

    2. Re:University Project by nickweller · · Score: 1

      "I attended the presentation for this chip, and as multiple audience questioners pointed out, this design hasn't been carefully designed to be clear of patents."

      What patent violations did the Hot Chips audience members ask about? Who asked the questions? Who were the questions directed to? What was the response?

    3. Re: University Project by Anonymous Coward · · Score: 0

      I forgot who asked, but I think it was a tech website blogger. The first issue was whether the terms of the compiler allow for use on non-AMD hardware. The second was the instruction set. The presenter made a joke about them running it by some AMD folks unofficially and then basically said if someone else implemented this it might be good to have some other patents to hit back with in case AMD or NVIDIA came knocking.

      So basically they don't know and don't care ;^)

      What followed was much snickering from the audience.

    4. Re: University Project by Anonymous Coward · · Score: 0

      Non enforced != Not broken.

      They actually have a system. If you have enough influence, you can ignore it or have the government intervene and force a low royalty rate or have your patent overturned.

    5. Re:University Project by I+AOk · · Score: 1

      > Nathan Brookwood, principal of market watcher Insight64

      --
      [iconv --from-code=utf-7]
  6. Nyuzi was first and is better by Theovon · · Score: 5, Interesting

    Check out "nyuzi.org". This is a fully functional open source GPU. It's synthesizable Verilog and works already in an FPGA. So not only is it more or less complete, but it also came out before MIAOW.

    1. Re:Nyuzi was first and is better by smallfries · · Score: 1

      Could you not even link to the fully functional open source GPU so that the lazy but curious could click, and google could perhaps realise that it exists?

      OK, I take that back. WTF has happened to links in the comment submission box? They've finally done it. Those crazy bastards have destroyed slashdot.

      --
      Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
    2. Re:Nyuzi was first and is better by Anonymous Coward · · Score: 1

      As suggested... http://nyuzi.org/

    3. Re:Nyuzi was first and is better by KGIII · · Score: 1

      http://nyuzi.org/

      Maybe they've disallowed ACs to link unless they do the whole markup?

      --
      "So long and thanks for all the fish."
    4. Re:Nyuzi was first and is better by alvieboy · · Score: 1

      nyuzi: "It is running on a single core at 50Mhz on a Cyclone IV FPGA. "

      Not too bad, but still far from fast (I consider everything under 80MHz on this family to be slow). Perhaps a bit more pipelining would help.

      Regarding TFA, there seem to be no frequency numbers, and I see they borrow much from OR1200. Last time I synthesized OR1K, it was painfully slow (like 8MHz on a Spartan-3E device). I think it has evolved in this area though.

      And I hate Verilog. I always wonder why they do not use VHDL. But it's a matter of opinion, I guess.

      Alvie

    5. Re:Nyuzi was first and is better by smallfries · · Score: 1

      I'm not an AC, I did the whole markup and I tried flipping the various settings but nothing would change it.

      I mean, obviously they've fixed it now in some sort of cover-up... testing.

      --
      Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
    6. Re:Nyuzi was first and is better by Anonymous Coward · · Score: 0

      (I'm the author of Nyuzi)

      nyuzi: "It is running on a single core at 50Mhz on a Cyclone IV FPGA. "

      Not too bad, but still far from fast (I consider everything under 80MHz on this family to be slow). Perhaps a bit more pipelining would help.

      Fmax is actually a bit north of 60 MHz. I'm running it at 50 MHz because it's a convenient multiple of the VGA dot clock and allows me to have everything in one clock domain. It's actually pipelined fairly deeply (12 stages total for floating point operations), but I haven't spent much time optimizing for clock speed. I'm sure some combinational logic cleanup, floor planning, and tool option tweaks could improve performance. FPGA is more of a test environment and hasn't been a priority so far.
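
      A quick sketch of the clock arithmetic, assuming the display mode is standard 640x480 at 60 Hz (the comment does not say which mode is used). That mode has 800x525 total pixel slots per frame and a nominal 25.175 MHz dot clock, which designs commonly approximate with 25 MHz; the function names here are made up for illustration:

```python
H_TOTAL = 800  # pixel slots per line for 640x480@60, including blanking
V_TOTAL = 525  # lines per frame for 640x480@60, including blanking


def dot_clock_from_system(system_hz: float, divider: int) -> float:
    """Dot clock obtained by dividing the system clock by an integer."""
    return system_hz / divider


def refresh_hz(dot_clock_hz: float) -> float:
    """Frame rate produced by a given dot clock for this timing."""
    return dot_clock_hz / (H_TOTAL * V_TOTAL)


if __name__ == "__main__":
    dot = dot_clock_from_system(50_000_000, 2)  # 50 MHz core clock, divide by 2
    print(dot)                                  # 25000000.0
    print(round(refresh_hz(dot), 2))            # 59.52
    print(round(refresh_hz(25_175_000), 2))     # 59.94
```

      Under these assumptions a divide-by-two from the 50 MHz core clock lands within about 1% of the nominal refresh rate, so no second clock domain is needed.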

      And I hate Verilog. I always wonder why they do not use VHDL. But it's a matter of opinion, I guess.

      For me, it seems to have better open source tools. I've been using Verilator (http://www.veripool.org/wiki/verilator), which is fast, stable, and has good SystemVerilog support.

  7. Willow Witching. by westlake · · Score: 1

    Check out "nyuzi.org". It came out before MIAOW.

    What is it about the geek that compels him to bury his most interesting projects somewhere south of the Ark of the Covenant? Never leaving behind the faintest clew to what it does or where it might be found.

    Traditionally, the most common dowsing rod is a forked (Y-shaped) branch from a tree or bush. The dowser then walks slowly over the places where he suspects the target (for example, minerals or water) may be, and the dowsing rod dips, inclines or twitches when a discovery is made. This method is sometimes known as "willow witching".

  8. Can I haz GPU? by khelms · · Score: 1

    nt

  9. Not opensource! by SuperDre · · Score: 1

    this GPGPU isn't opensource at all, it's using a design by AMD which was never opensourced, and it's using patented technology.. So if you're gonna create one, you're gonna have to pay a lot of license fees...

    1. Re:Not opensource! by Tough+Love · · Score: 1

      mods, see the troll who says black is white

      --
      When all you have is a hammer, every problem starts to look like a thumb.