Slashdot Mirror


Hunting Malware With GPUs and FPGAs (hackaday.com)

szczys writes: Rick Wesson has been working on a solution to identify the same piece of malware that has been altered through polymorphism (a common method of escaping detection). While the bits are scrambled from one example to the next, he has found that using a space filling curve makes it easy to cluster together polymorphically similar malware samples. Forming the fingerprint using these curves is computationally expensive. This is an Internet-scale problem which means he currently needs to inspect 300,000 new samples a day. Switching to a GPU to do the calculation proved four orders of magnitude efficiency over CPUs to reach about 200,000 samples a day. Rick has begun testing FPGA processing, aiming at a goal of processing 10 million samples in four hours using a machine drawing 4000 Watts.
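The summary does not say which space-filling curve is used or how the fingerprint is computed from it. As a minimal sketch, assuming a Hilbert curve (one common choice) and a hypothetical helper `bytes_to_grid`, the code below lays a byte stream out on a square grid so that bytes that are close together in the file stay close together in 2D. It illustrates the locality idea only and is not Wesson's fingerprinting method.

```python
def rotate(s, x, y, rx, ry):
    """Rotate/flip a quadrant of side s so the sub-curves join end to end."""
    if ry == 0:
        if rx == 1:
            x = s - 1 - x
            y = s - 1 - y
        x, y = y, x
    return x, y


def d2xy(side, d):
    """Map distance d along a Hilbert curve to (x, y) on a side x side grid.

    side must be a power of two and 0 <= d < side * side.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        x, y = rotate(s, x, y, rx, ry)
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def bytes_to_grid(data, side):
    """Hypothetical helper: place byte i of data at the i-th cell of the curve."""
    grid = [[0] * side for _ in range(side)]
    for d, value in enumerate(data[: side * side]):
        x, y = d2xy(side, d)
        grid[y][x] = value
    return grid


if __name__ == "__main__":
    # Lay out 16 consecutive byte values on a 4x4 grid and print the rows.
    for row in bytes_to_grid(bytes(range(16)), 4):
        print(row)
```

Consecutive positions along the curve always land in neighbouring cells, so a small local change in the input disturbs only a small region of the grid.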

44 comments

  1. Acronyms... by Anonymous Coward · · Score: 0, Flamebait

    WTF is a GPU?

    1. Re:Acronyms... by Anonymous Coward · · Score: 4, Informative

      Graphics Processing Unit.

      It's more or less a CPU with more cores and less functionality per core. There are typically a few instructions you would otherwise expect from a DSP, like saturating addition.

    2. Re: Acronyms... by WarJolt · · Score: 1

      That's an extreme oversimplification.

      If you assume you have 1000 threads of execution, you could execute each one independently on a 1000 core machine. This is not true on a GPU. Those threads on a GPU will be grouped together. Each thread in a group will be executing the same instructions, so you can't have each thread executing independent code.

      Conventional CPUs can handle any permutation of branches. On the GPU, if you have an "if-else" condition and some threads in a group do the "if" and others do the "else", you have to wait for the "if" case first, then execute the "else" case.
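The cost of that grouping can be sketched with a toy model. This is a plain Python simulation, not GPU code: the function name `simt_cycles`, the cycle costs, and the group size of 32 (the warp size on NVIDIA hardware) are assumptions chosen for illustration. Each group pays for every side of the branch that at least one of its threads takes.

```python
def simt_cycles(conds, if_cost, else_cost, group_size=32):
    """Toy cost model for lock-step thread groups.

    conds[i] is True if thread i takes the "if" side. A group runs the
    "if" side when any of its threads needs it, then the "else" side
    when any of its threads needs that, so a mixed group pays for both.
    """
    total = 0
    for start in range(0, len(conds), group_size):
        group = conds[start:start + group_size]
        if any(group):
            total += if_cost
        if not all(group):
            total += else_cost
    return total


# 64 threads, branches sorted so each group is uniform: 10 + 10 cycles.
uniform = [True] * 32 + [False] * 32
# 64 threads, branches alternating so both groups diverge: 2 * (10 + 10).
divergent = [i % 2 == 0 for i in range(64)]

print(simt_cycles(uniform, 10, 10))
print(simt_cycles(divergent, 10, 10))
```

In this model the same 64 threads doing the same total work cost twice as much when the branch outcome alternates within each group.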

    3. Re:Acronyms... by Anonymous Coward · · Score: 0

      why are you on this site ?

    4. Re:Acronyms... by KGIII · · Score: 1

      Eternal September.

      --
      "So long and thanks for all the fish."
    5. Re:Acronyms... by thegarbz · · Score: 1

      WTF is a GPU?

      An indication that you stumbled onto the wrong site, read the wrong article, and then proceeded to comment just to figure out where on the internet you ended up.

      Here let me direct you back to mainstream media

  2. 4KW by sims+2 · · Score: 2

    Wow, how are you powering this thing, a dryer plug?
    Multiple PSUs?

    That's a heck of a lot of power for a single machine.

    --
    Minimum threshold fixed. Thanks!
    1. Re:4KW by sims+2 · · Score: 1

      Then again, by power cost he may have meant 4KWH, which would be what you would expect from a machine using 1KW for 4 hours.

      That's probably what happened, but it's not nearly as interesting.

      --
      Minimum threshold fixed. Thanks!
    2. Re:4KW by Pharmboy · · Score: 1

      At 240VAC, that is less than 17 amps, so a dedicated 20 amp circuit with 12 gauge wire would do it (NEMA 6-20). No bigger than a standard computer plug. Still, that is a shitload of power.
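The arithmetic behind the "less than 17 amps" figure is just current = power / voltage. A quick sketch (the function name is made up for illustration, and it ignores power factor and any continuous-load derating):

```python
def current_amps(power_watts, volts):
    """Current drawn by a load of power_watts at the given voltage."""
    return power_watts / volts


load = current_amps(4000, 240)
print(f"{load:.1f} A at 240 V")
print(f"{current_amps(4000, 120):.1f} A at 120 V")
print("below a 20 A breaker rating:", load < 20)
```

At 120 V the same 4000 W would need roughly twice the current, which is why the 240 V circuit is the natural choice here.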

      --
      Tequila: It's not just for breakfast anymore!
    3. Re:4KW by gstoddart · · Score: 1

      Throw enough high-draw things like fancy graphics cards into a cluster or a rack, and it doesn't seem all that difficult.

      A quick google says power hungry GPUs are the norm.

      With great power comes great power bills.

      --
      Lost at C:>. Found at C.
    4. Re:4KW by dmbasso · · Score: 1

      On the bright side (and by that I don't mean the IR it emits) it may make coffee as a side product!

      --
      `echo $[0x853204FA81]|tr 0-9 ionbsdeaml`@gmail.com
    5. Re:4KW by DigiShaman · · Score: 1

      I'm guessing he's powering the computers in a rack hosted in a colo?

      --
      Life is not for the lazy.
    6. Re:4KW by Chris+Mattern · · Score: 1

      And then, of course, there's the flipside of power consumption: heat production. He's not only got to provide the current, he's going to have to provide cooling for this little bonfire he's constructed.

    7. Re:4KW by yodleboy · · Score: 1

      whoa there! The article you quoted is misleading. First off, it lists Recommended Power Supplies. This is NOT the same as the power draw by the GPU. This is the manufacturer's recommendation of what you need to ensure stable performance of EVERYTHING in the PC with that card installed. The higher end the card, the greater the recommended minimum, partly to compensate for increased GPU needs, but also because the kind of people that run these cards are likely to have a crap load of other stuff that needs feeding as well. Who runs these cards? Gamers. Anyone that does serious PC gaming uses Steam for at least some games. Guess what? Steam does an opt-in hardware survey on a regular basis. While I'm sure they keep it anonymous, I'm also sure they sell that useful data to companies like NVIDIA and AMD. They KNOW what kind of total system power draw the buyers of a particular card have and can certainly extrapolate to a new card.

      Anyway, even a screamer like the NVIDIA GTX Titan X is drawing less than 250 watts at max load (the SLI entries on your link are running multiple linked cards). Here's a power consumption test by Tom's Hardware http://www.tomshardware.com/re....
      GPUs need a lot of power relative to other components, but they are in fact extremely efficient devices. Every new batch of cards that comes out manages to outperform the last AND use less power doing it.

    8. Re:4KW by gstoddart · · Score: 1

      Sorry, I wasn't clear in what I was citing:

      Present graphics cards with a power consumption over 75 Watts include a combination of six-pin (75W) or eight-pin (150W)

      At 150W, that's 26 cards. At your 250W, that's 16 cards.

      4000W isn't that hard to reach ... put it into a case with all of the other things, and you're talking, what, 3-4 machines?

      So, yeah, they're not 1000W cards, but if you're talking about combining a couple plus the rest of the overhead it's not that hard to get there.
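The card counts above are the 4000 W budget divided by the per-card draw, rounded down. A small sketch of that division (the function name is hypothetical, and it ignores CPUs, drives, fans and PSU losses, which is why the real number of cards would be lower):

```python
def cards_within_budget(budget_watts, watts_per_card):
    """Whole number of cards whose combined draw fits inside the budget."""
    return budget_watts // watts_per_card


for draw in (150, 250):
    print(f"{draw} W per card: {cards_within_budget(4000, draw)} cards")
```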

      --
      Lost at C:>. Found at C.
    9. Re:4KW by ArylAkamov · · Score: 1

      Best part about still owning an older CRT monitor, aside from quality and refresh rate:

      It makes a great space heater!

    10. Re:4KW by godel_56 · · Score: 1

      whoa there! The article you quoted is misleading. First off, it lists Recommended Power Supplies. This is NOT the same as the power draw by the GPU. This is the manufacturer's recommendation of what you need to ensure stable performance of EVERYTHING in the PC with that card installed. The higher end the card, the greater the recommended minimum, partly to compensate for increased GPU needs, but also because the kind of people that run these cards are likely to have a crap load of other stuff that needs feeding as well.

      Actually the high power recommendations are to cope with the clueless noobs who buy white box PSUs, which can barely supply 50% of their rated current for an extended period without catching fire. Oh, and maybe also to allow some small headroom for later system expansion.

    11. Re:4KW by Anonymous Coward · · Score: 0

      That's still possible with COTS GPGPU servers. Since the article mentioned Xeon Phis, I checked Intel's server partner SuperMicro. They have a 5U, 8 CPU server with 10 full-size PCI-e slots and 4x2800 W PSUs (2+2 redundant, so 5600W nominal). You do need 4 outlets for this, of course. You will want to wire them to separate breakers. In theory you could share a breaker between an active and a standby PSU, but that assumes the failover is smooth.

    12. Re:4KW by KGIII · · Score: 1

      It takes two wind turbines and about 7200ft^2 but I can run a *very* large house, complete with a server room and network closet.

      It is not cost effective so you still have a point. Technically, I push power into the grid and get credits. I can use, save, sell, or trade those credits. Once I've gone a whole year with my current configuration - I'll be donating them to a local elementary school. They're evil bastards who somehow con me into shit. They send me cards, Valentines' Day cards, and make me awful cookies. I also have to go listen to them sing, watch them act, and hear their "music." If I hear Ode To Joy one more time....

      Actually, I sat in with the little kiddies and did Ode To Joy and Silent Night on an acoustic guitar with only a little bit of mic involved. I pretend I hate it but, being retired, it's something I've come to enjoy.

      Now to get rid of pneumonia, I just got diagnosed today and it explains where I've been all week, for the most part. Even typing this is damned annoying.

      --
      "So long and thanks for all the fish."
  3. A Business Opportunity by BradleyUffner · · Score: 1

    All the malware authors could make some easy money selling him some processing time from all the botnets they run.

  4. Is hackaday the new click purchaser? by Anonymous Coward · · Score: 0

    Awful lot of links to hackaday lately, including that stupid one that basically said "good tools are better than bad ones".

    1. Re: Is hackaday the new click purchaser? by Anonymous Coward · · Score: 0

      Can you blame them for writing that? From their perspective, /. editors are good tools.

  5. For research, this seems invaluable by chispito · · Score: 1

    For research, this seems invaluable. I'm sure it will help a lot in profiling real attackers right now.

    As an effective deterrent, I cannot imagine this will be viable long-term. It seems to me that it is much easier for the attackers to generate more permutations than it is for the defenders to identify them. Will clients be able to keep up with matching against that many definitions? Maybe you only scan on particular servers, and because of the CPU intensive nature, you sell it as a service. Well, guess what? Everything would have to be decrypted to be identified, so then there are privacy concerns.

    In the worst case scenario, your attackers run their malware through FPGAs to send a unique permutation to each victim.

    --
    The Daddy casts sleep on the Baby. The Baby resists!
    1. Re:For research, this seems invaluable by gstoddart · · Score: 1

      Well, it's really interesting ... from the limited stuff in the article, it's essentially calculus (I think).

      Sure, you can do a lot of permutations, but you can only do so many of them which are fundamentally different. Because they still share some underlying similarity with the original.

      As I understood it, imagine a wavy line through space. Variations of the same thing will follow that wavy line +- some space around that line for the permutations. Close up the permutations look really different, but as you pull back, the space around the wavy line collapses into just a thicker line.

      If you came from the same original source, you can only be so much different from that. And very different origins will create different wavy lines through space.

      So, even a unique permutation could be detected as "well, ultimately it's just one of these, otherwise it wouldn't match so closely".

      Of course, calculus was a long time ago, and this isn't my field so I might have missed the mark ... but it seems like your unique permutation could fall into a category which is more readily recognized as being in a known category.

      If you're within a certain margin of that wavy line through space, you're just a permutation.
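The "within a certain margin" idea can be sketched as a nearest-reference check with a distance threshold. Everything here is an assumption for illustration: the fingerprints are made-up number vectors, the family names are invented, and Euclidean distance stands in for whatever similarity measure the real system uses.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length fingerprint vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(fingerprint, families, margin):
    """Return the closest known family within margin, or None if none is."""
    best_name, best_dist = None, margin
    for name, reference in families.items():
        d = distance(fingerprint, reference)
        if d <= best_dist:
            best_name, best_dist = name, d
    return best_name


# Hypothetical reference fingerprints for two known families.
families = {"family_a": [0.0, 0.0], "family_b": [10.0, 10.0]}

print(classify([0.5, 0.5], families, margin=2.0))   # close to family_a
print(classify([5.0, 5.0], families, margin=2.0))   # too far from both
```

A sample that falls outside every margin is the "whole new thing" case: it is not a variation of anything already known.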

      --
      Lost at C:>. Found at C.
    2. Re:For research, this seems invaluable by Anonymous Coward · · Score: 0

      If this method is implemented, then the terrorists^Wmalware authors have won.

    3. Re:For research, this seems invaluable by chispito · · Score: 1

      All I really meant is that it is nearly always easier for the attackers to adapt than the defenders.

      --
      The Daddy casts sleep on the Baby. The Baby resists!
    4. Re:For research, this seems invaluable by gstoddart · · Score: 1

      Sure, but taken far enough this solution would mean the attackers would need to write a whole new thing.

      Once you have this, you check something new, identify it as a match of the thing, and add it.

      The attackers can always be more nimble, but with a solution which can adapt and say "oh, that's just a variation of this, I'll block it" you can at least ratchet up the arms race.

      --
      Lost at C:>. Found at C.
    5. Re:For research, this seems invaluable by SethJohnson · · Score: 1

      Like you, I'm outside my discipline trying to comment here....

      Sure, but taken far enough this solution would mean the attackers would need to write a whole new thing.

      I'm thinking one really big wrench that can be thrown into algorithmic detection is if a randomly selected salt is used in each permutation of the malware. That could force this type of analysis to require dramatically larger resources with little architectural investment on the part of the malware creators.

    6. Re:For research, this seems invaluable by Capt.Albatross · · Score: 1

      Information on what Wesson is calculating seems hard to come by, but this may be it:
      https://www.google.ch/patents/...

  6. Something doesn't add up again by wwalker · · Score: 2

    Switching to a GPU to do the calculation proved four orders of magnitude efficiency over CPUs to reach about 200,000 samples a day.

    4 orders of magnitude?! Was he processing 20 samples a day before? What kind of CPU was he using? 8088?

    1. Re:Something doesn't add up again by Anonymous Coward · · Score: 1

      4 orders of magnitude in efficiency, not throughput. Currently he's processing 200,000 samples per day, using 4 kW. He might have processed 20 samples a day using the same 4 kW, or 2000 samples a day using 400 kW. The latter may sound like a lot, but renting such peak processing power is the whole point of the cloud.
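Those two scenarios follow from treating efficiency as samples per day per kW. A short sketch of the arithmetic, using the figures from this comment (200,000 samples a day at 4 kW, which is itself this commenter's reading of the summary) and made-up variable names:

```python
GPU_SAMPLES_PER_DAY = 200_000
GPU_POWER_KW = 4
ORDERS_OF_MAGNITUDE = 4

# Efficiency in samples per day per kW.
gpu_efficiency = GPU_SAMPLES_PER_DAY / GPU_POWER_KW
cpu_efficiency = gpu_efficiency / 10 ** ORDERS_OF_MAGNITUDE


def samples_per_day(efficiency, power_kw):
    """Throughput reached at a given efficiency and power budget."""
    return efficiency * power_kw


print(samples_per_day(cpu_efficiency, 4))    # same power, far fewer samples
print(samples_per_day(cpu_efficiency, 400))  # 100x the power, 100x the samples
```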

  7. 4 orders of magnitude? by godrik · · Score: 1

    Really, 4 orders of magnitude? 10000 times faster with GPUs than CPUs? I call bullshit. You might get a factor of 100 if you pick a SoA GPU and a shitty CPU. But comparing things of similar generation, you will not get a factor of 100 on modern hardware. So either they are not in base 10, or there is BS going on.

    1. Re:4 orders of magnitude? by Anonymous Coward · · Score: 0

      Really, 4 orders of magnitude? 10000 times faster with GPUs than CPUs? I call bullshit. You might get a factor of 100 if you pick a SoA GPU and a shitty CPU. But comparing things of similar generation, you will not get a factor of 100 on modern hardware. So either they are not in base 10, or there is BS going on.

      This is /. If you call BS, you have to provide some technical justification.

      A GPU is easily four orders of magnitude more efficient at doing 3D graphics calculations than a conventional CPU, thanks to its specialized architecture. As clustering algorithms use floating point math, it's entirely believable that similar speed ups are possible.

    2. Re:4 orders of magnitude? by Arkh89 · · Score: 1

      No it isn't. Rule of thumb: 1 high end GPU = 12x 4 cores in single precision, 8x 4 cores in double.
      People getting more than two orders of magnitude differences are comparing (highly-)unoptimized code.
      Optimizing code on CPU or GPU is hard.

    3. Re:4 orders of magnitude? by Anonymous Coward · · Score: 0

      He claims 9 orders of magnitude with Phi, which is even more surprising since the threading model is much closer to a CPU than to a GPU.

    4. Re:4 orders of magnitude? by godrik · · Score: 1

      So a GPU like the Titan X gets you about 2Tflop/s and 350GB/s of memory bandwidth. A modern Core i7 with 8 cores gets you about 100Gflop/s and about 50GB/s of memory bandwidth. If you are looking at integer ops you get similar ratios.

      So assuming you can saturate both architectures, you should see a difference of roughly a factor of 10. If your application saturates one architecture and not the other one, I could buy another factor of 10 with a bit of arguing. But to have another factor of 100, you need to do something completely ridiculous.
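Plugging the quoted peak figures into a quick ratio check shows where "roughly a factor of 10" comes from. The numbers are the ones given in this comment; the dictionary layout and function name are just for illustration.

```python
import math

# Peak figures as quoted above (flop/s and bytes/s).
gpu = {"flops": 2e12, "bandwidth": 350e9}
cpu = {"flops": 100e9, "bandwidth": 50e9}


def speedup_bounds(fast, slow):
    """Ratio of each peak resource between two machines."""
    return {key: fast[key] / slow[key] for key in fast}


bounds = speedup_bounds(gpu, cpu)
for name, ratio in bounds.items():
    orders = math.log10(ratio)
    print(f"{name}: {ratio:.0f}x, about {orders:.1f} orders of magnitude")
```

A compute-bound kernel is capped near the flops ratio and a memory-bound one near the bandwidth ratio, so both bounds sit between one and two orders of magnitude, well short of four.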

    5. Re:4 orders of magnitude? by godrik · · Score: 1

      Phi is a strange architecture and people aren't used to it. But 9 orders of magnitude is a bit ridiculous, really 1 billion times faster... Do you have the actual report?

    6. Re:4 orders of magnitude? by Anonymous Coward · · Score: 0

      That's what is said in the linked article, and it's early testing, so it probably can't be found in any formal report yet.

  8. 4000 watts by fustakrakich · · Score: 1

    I hope he does something useful with the heat. And now we're giving the electric company some incentive to make viruses. If all this detective work generates so much revenue, well, as Kennedy said, "why not?"

    --
    “He’s not deformed, he’s just drunk!”
  9. Why hunt malware? by Anonymous Coward · · Score: 0

    Use systems that aren't vulnerable instead (i.e., anything not from Microsoft). Malware is not much of a problem, except on Windows. And there are quite a few alternatives to Windows these days.

    1. Re:Why hunt malware? by Anonymous Coward · · Score: 1

      The only reason Windows has more malware is that it is far more widely used (even on servers it's a 50/50 split vs. Linux to this very day). On those grounds alone it's a greater return on investment for botnet creation, since a botnet is more effective with greater numbers of enslaved nodes to call on for, say, a DDoS attack. You're going to get that using Windows, which commands a good 94% of the PC desktop market against Linux or Mac OS X. That's the only real reason, not that Windows is less securable (Windows is easily security-hardened against that, as are the *NIX variants I noted. None of them are as secure as possible in their default out-of-the-box configurations).

  10. Cloudsource the computer power required? by Tijaska · · Score: 1

    Finding malware benefits most computer users. Could this search be spread over large numbers of computers across the Internet? Computer owners could volunteer spare machine cycles to aid the search.

  11. Hunting in GPUs and FPGAs? by Anonymous Coward · · Score: 0

    When I first read this in my head it was:

    "Hunting Malware In GPUs and FPGAs"

    I was imagining a team of engineers scouring datasheets and reverse engineering register interfaces and blocks of logic.

    Reality is so much more... rudimentary.