Slashdot Mirror


Ask Slashdot: How Many (Electronics) Gates Is That Software Algorithm?

dryriver writes "We have developed a graphics algorithm that got an electronics manufacturer interested in turning it into hardware. Here comes the problematic bit... The electronics manufacturer asked us to describe how complex the algorithm is. More specifically, we were asked 'How many (logic) gates would be needed to turn your software algorithm into hardware?' This threw us a bit, since none of us have done electronics design before. So here is the question: Is there a piece of software or another tool that can analyze an algorithm written in C/C++ and estimate how many gates would be needed to turn it into hardware? Or, perhaps, there is a more manual method of converting code lines to gates? Maybe an operation like 'Add' would require 3 gates while an operation like 'Divide' would need 6 gates? Something along those lines, anyway. To state the question one more time: How do we get from a software algorithm that is N lines long and executes X number of total operations overall, to a rough estimate of how many gates this algorithm would use when translated into electronic hardware?"

21 of 365 comments (clear)

  1. Holy crap by CajunArson · · Score: 5, Insightful

    Either implement it as shaders for a GPU (or a DSP) or hire somebody who actually knows about hardware design if you are hell-bent on implementing an ASIC.

    Slashdot: Where *not* to go to get specific advice about specific technical issues.

    --
    AntiFA: An abbreviation for Anti First Amendment.
    1. Re:Holy crap by Megane · · Score: 5, Funny

      And to think, they rejected my Ask Slashdot submission on how to find a cheat code on my bank's web site for unlimited moneys

      Just walk up to any ATM and press: up up down down left right left right B A start.

      --
      #naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }
    2. Re:Holy crap by Joce640k · · Score: 5, Insightful

      Just 'fess up and say "We don't know, we're software people, not hardware people".

      If it's really important they might offer some help.

      --
      No sig today...
    3. Re:Holy crap by Goaway · · Score: 5, Insightful

      This is the only sane answer. They probably only asked to find out if you happened to know.

      Say you don't know, and let them look at the code to figure it out.

    4. Re:Holy crap by Austerity+Empowers · · Score: 5, Informative

      To give a more helpful, unhelpful answer, it's an ill-formed question. "How many gates" depends on the target on which you synthesize the hardware: a PCB, an FPGA, actual silicon (which fab? Which process? whose std cell library? what clock frequency?).

      If somehow the above could be narrowed down by asking the customer, then the next thing I'd advise is contracting someone who can write RTL using an HDL (verilog is most popular). The synthesizeable subset of HDL is tricky to learn for non-HW people, so unless you understand digital logic well I'd suggest finding someone else to do it for you. They can then synthesize it to the targeted device/platform. If you can do this, you should charge quite a lot of money since this form of IP is expensive, and they know it. If they're ok with that, you may also want to have this contractor also write the design verification suite, since this company will certainly want that to integrate into their own testing. Lots of contractors are out there for this due to the cyclic nature of this job, make sure you also have some support feature in place if you need them to fix/update the code later.

      Even simple software algorithms can be very big in HW, but some surpisingly complex SW algorithms are next to 1 liners in HW (like any form of bit masking or bit swizzling is free!). But generally if there are a lot of sequential steps, and those steps are different...it gets big. Also assume that for every 1 SW guy that wrote the code, you will need 1 RTL designer. If you take the verification step, it may be 1-2 verification engineers for 1 RTL, depending on your timeline.

  2. Why don't they know? by Anonymous Coward · · Score: 5, Insightful

    You'd think the "electronics manufacturer" would have some idea how to estimate this.

    1. Re:Why don't they know? by janeuner · · Score: 5, Insightful

      They do have a way. They asked if it had already been determined.

      The correct response is, "We don't know."

  3. Just like any other software project by mbadolato · · Score: 5, Funny

    Make up a number, then when they complain that it was way off, blame it on their management changing scope a hundred times throughout the life of the project!

  4. Try Stackoverflow perhaps? by Anonymous Coward · · Score: 5, Insightful

    I think you may have a better chance of getting an answer if you ask this question on Stackoverflow (or one of its related sites).

    Unfortunately, I think asking on Slashdot is only likely to get you some tired and outdated memes / jokes...

  5. They shouldn't be asking you. by pavon · · Score: 5, Insightful

    If they plan on implementing this in hardware, then they should have people who are capable of answering that question. If instead, they are just a manufacturer and aren't capable of doing the actual hardware design, then you have bigger problems than answering this question. That is something you should find out about ASAP.

  6. Minecraft by nbetcher · · Score: 5, Funny

    Develop out the algorithm in Minecraft using ProjectRed (Integration module, specifically) and then you can easily count the gates! :-)

  7. Re:Verilog by Anonymous Coward · · Score: 5, Interesting

    if you only need a estimation, use something like bamboo from PandA to convert your C Code to Verilog. Then synthesize this code for a FPGA. In the summery you should find how many logic cells would be used as well as how many digital gates in an asics are necessary. This value is only a estimation, but for your question, this should work.

  8. It doesn't work like that by cjonslashdot · · Score: 5, Informative

    It's about more than gates. It is about registers, ALUs, gates, and how they are all connected. There are many different possible architectures, so it depends on the design: some designs are faster but take more real estate. There are algorithm-to-silicon compilers (I know: I wrote one for a product company during the '80s and it is apparently still in use today) but each compiler will assume a certain architecture. I would recommend one but I have been out of that field for decades.

  9. Easy calculation by Anonymous Coward · · Score: 5, Funny

    Here is a proven method for calculation.

    If your code is:
    a) C: divide the number of lines with 7
    b) C++: divide the number of lines with 5
    c) Ruby/Python/Java: divide the number of lines with 3
    d) Perl: multiply the number of lines with 42
    e) C#: resign.

  10. It's an optimization problem by swm · · Score: 5, Insightful

    You already have your algorithm running in electronic hardware, right?
    Your current gate count is the sum of
      * the gate count of your CPU
      * the gate count of your RAM
      * the gate count of your program ROM

    So that's an upper bound on the gate count.
    If that number is too big for your manufacturing partner,
    then you have an optimization problem.

    Optimization is a hard problem...

  11. The key to success. by ttucker · · Score: 5, Insightful

    Do not ask a computer scientist to be an electrical engineer.

  12. Re:Verilog by ranulf · · Score: 5, Informative

    The number of slices or logic cells or whatever else a particular synthesis program for a particular chip generates doesn't exactly correspond to a number of gates either. For instance, a single 4-in 1-out LUT on a Xilinx can be used for 1 gate or 6.

    I wouldn't have much confidence in automatic C to HDL conversion either. Good HDL design is about understanding the problem in terms of gates and parallelism. FPGAs and ASICs in general aren't particularly good at things that CPUs are good for, and inversely CPUs aren't especially good for things that FPGAs and ASICs can do well.

    The OP shows such a lack of understanding of hardware design that it's not funny! "Add = 3 gates, Divide = 6 gates" is quite comical to anyone who actually knows these things. A more ball park is that an n-bit add can be done with 2n LUTs, in terms of gates it's about 5n gates, but really that depends what gates you have available. A multiplier is massively more, dividing is even more complicated still. Fortunately, many FPGAs come with a few dedicator multipliers... Unless your algorithm requires only as many multipliers as you have available, you're probably best building a state machine and multiplexing a single multiplier unit, in much the same way as a CPU multiplexes the ALU at its core.

    The whole thing is massively dependent on algorithm and experience of the person doing the porting. The best advice is to say "I don't know" or to hire someone who does or suggest them running the algorithm on an embedded CPU.

  13. Re:Verilog by harrkev · · Score: 5, Informative

    Seriously???? Asking a C++ programmer to begin to use Verilog is simply not practical. There is a VERY STEEP learning curve in trying to target real hardware. There is even a very different frame of mind that has to be learned in order to target gates.

    I speak from experience. I program Verilog and SystemVerilog for a living doing ASIC design.

    Now, to answer the OP:

    The answer is very strongly: it depends. The most optimistic answer is a couple hundred thousand. Implement an 8-bit CPU and write the thing in under 32K of code.

    On the other end of the spectrum is "many billions." Design your own x86 multi-core CPU, throw a couple of gigs of SRAM on the ASIC, tons of flash for a solid-state disc drive, and you will have a complete high-end PC on a chip. Then add your software.

    Of course, these are both ridiculous extremes. Everything depends on the TYPE of operations being done. In a CPU a simple 32-bit multiply can be done with one character ("*"). In gates, if you need the answer in a single clock cycle, it can take an EXTREME amount of logic. However, if you are willing to wait 32 clock cycles for the answer, the amount of logic is reduced to a very manageable level. This is why C++ is a bad choice of input. How time-sensitive is it? Hardware is also very parallel in nature. Different parts of the chip can indeed be working on different things at the same time. You can go for a strictly pipelined architecture where each block does one little bit of the job and passes it off to the next block. High throughput, but lots of gates. Or you could design a general-purpose block and have it to everything slowly (the most extreme example of this approach is a common CPU).

    While I have heard of magic "C to gates" compilers, after almost 15 years in the business, I have never actually seen one. The closest that I have seen are tools that can turn Matlab code into (messy-looking) gates. If your algorithm is DSP in nature, this is a very viable alternative. Otherwise, the only advice that I can give you is to consult somebody who does hardware design for a living (like me).

    Otherwise, you really need to look at where the input comes from, where the output goes, and how fast you need to do the work.

    --
    "-1 Troll" is the apparently the same as "-1 I disagree with you."
  14. Knowing The Algorithm Is NOT Enough by SplawnDarts · · Score: 5, Insightful

    Knowing what algorithm you want to run in hardware in not even close to enough to estimate gates. You need to know the algorithm, and the required performance, and have a sketched out HW design that meets those goals. THEN you can estimate gate count.

    For a simple example of why this is, consider processors. A 386 and a Sandy Bridge i7 implement very similar "algorithms" - it's just fetch->decode->execute->writeback all day long. If you implemented them in software emulation, it would be very similar software with some additional bits for the newer ISA features on the i7. But a 386 is about 280 THOUSAND gates, and the i7 is about 350 MILLION gates/core - three orders of magnitude different. Of course, there's at least a 2 order of magnitude performance difference too - it's not like those gates are going to waste.

    Point is, knowing the algorithm isn't enough to get even a finger in the wind guess at gate count. If you need an answer to this question, you need to get competent HW design people looking at it.

  15. Re:Verilog by harrkev · · Score: 5, Interesting

    Oh, one more thing about "C to Gates" compilers. In the industry I have not seen one in actual use, but they do supposedly exist. However, they would only work in a limited domain.

    For example, if you have C++ that does simple control or DSP-type stuff, then it might work (cannot vouch for the quality of the results). On the other hand, if you get one of these compilers and try feeding it the source code for the Apache web server or the Quake engine source code, you are completely screwed.

    If your application is, say, a novel type of network filter that inspects and does something to Ethernet packets, you have to figure out how to interface your design with a real Ethernet SerDes .. which is a *LOT* different than opening up something in the "/dev/" directory. If your application is robotics, then you also need to get data into and out of the chip. How exactly is this done? How fast does the logic need to run? Is it speech processing? If so, then this will involve a lot of straight-forward DSP. If you constrain the design to tell it how fast the data needs to flow through, you should be able to get a reasonable estimate. Does your application need a lot of memory? If so, you might need some type of RAM controller. DRAM controllers can be hairy to work with, and you also have to consider latency and throughput.

    In theory, C to gates can work quite well, ***for a limited subset of applications***.

    HOWEVER: as others have pointed out, anybody who needs to know the answer to this question should be qualified to answer it for themselves.

    --
    "-1 Troll" is the apparently the same as "-1 I disagree with you."
  16. Re: Verilog by harrkev · · Score: 5, Insightful

    I still must disagree. Yes, the syntax is somewhat like C. However, WHAT you are coding is completely different. In particular, things that C and do with a simple "if" statement are not allowed at all in proper gate design. It is not hard to imagine a software guy coding latches all over the place, assigning the same signals from withing different always blocks, etc. Even "always @(posedge clock)" may be a fundamental paradigm shift for a software guy. And not to mention the rather arbitrary way that Verilog treats wire vs. reg.

    wire a = b & c;

    wire a;
    assign a = b & c;

    reg a
    always @(*) a = b & c;

    These three constructs do the same thing. Why is one "wire" and one "reg"?

    What is the difference between the two blocks (they are NOT the same - blocking vs. non-blocking)?

    always @(posedge clk) begin
        a = b;
        c = a & b;
    end

    always @(posedge clk) begin
        a = b;
        c = a & b;
    end

    What about race conditions? Glitches on combinatorial logic? Proper coding of state machines? Need memory? How do you drop in an encrypted 3rd party DDR controller and PHY? Interface with AHB bus? In a given process, how many levels are logic are reasonable for a given clock speed? What exactly are hold violations?

    I am not saying that any of these are insurmountable. What I am saying is that a good digital designer is worth paying for, and a software guy may have a very steep learning curve indeed.

    --
    "-1 Troll" is the apparently the same as "-1 I disagree with you."