Slashdot Mirror


RC4 Code Achieves 319 MB/s On AMD64 Opteron

Marc Bevand writes "This recent paper is about optimizing RC4 for AMD64 processors. A working implementation is provided. Its encryption/decryption throughput reaches 319 MB/s on a single AMD Opteron x44 processor running at 1.8 GHz. This makes it, as of today, the world's fastest RC4 symmetric cipher implementation for general purpose CPUs. As the author of this work, I would like to point out that many CPU-hungry applications have not been optimized for AMD64 yet. In other words: such speedups can be expected in other areas." An anonymous reader adds some figures for the old implementation: "Opteron 244 1.8 GHz (32-bit) 163 MB/s; Opteron 244 1.8 GHz (64-bit) 135 MB/s."
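
For readers unfamiliar with the cipher, here is a minimal sketch of textbook RC4 in Python. It is not the optimized AMD64 implementation from the paper, and it says nothing about how that code reaches its throughput. The function name `rc4` and its signature are chosen here for illustration only. The byte-at-a-time swap loop is the data-dependent work that any optimized version has to speed up.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4 (illustrative sketch, not the paper's optimized code)."""
    if not 1 <= len(key) <= 256:
        raise ValueError("RC4 key must be 1..256 bytes long")

    # Key-scheduling algorithm: permute the 256-byte state using the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]

    # Pseudo-random generation: one keystream byte per input byte,
    # XORed with the input. Encryption and decryption are identical.
    out = bytearray(len(data))
    i = j = 0
    for n, byte in enumerate(data):
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out[n] = byte ^ s[(s[i] + s[j]) & 0xFF]
    return bytes(out)


# Applying the cipher twice with the same key restores the input.
ciphertext = rc4(b"Key", b"Plaintext")
assert rc4(b"Key", ciphertext) == b"Plaintext"
```

Because the same function both encrypts and decrypts, the summary's single "encryption/decryption throughput" figure covers both directions.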

8 of 177 comments

  1. Optimisation is definitely the key by datajack · · Score: 5, Informative

    I was initially disappointed with the performance of my Athlon64. CPU intensive 64bit code often seemed much slower than its (heavily optimised) 32bit counterpart.

    Every now & then I come across some code optimised for 64bit processors, and it just flies - as more & more stuff gets the treatment, it will be like upgrading for free :)

  2. Somewhat OT, but... by bhtooefr · · Score: 4, Informative

    If all a machine is doing is encrypting, A64s and Opterons are a bit overkill. The VIA C3 C5P has an encryption engine that makes top-of-the-line processors look sad. I couldn't find results for RC4, but here is a page from a review of the EPIA MII-12000 which shows AES results. The first graph is EPIAs in software; the second is a few Intel and AMD CPUs (software), and the MII-12000 in software (which gets creamed by the AXP 2500+ and the P4@2.4) and hardware (which totally obliterates everything).

    1. Re:Somewhat OT, but... by mczak · · Score: 5, Informative
      AFAIK, the VIAs *only* do AES, as they're designed to make good VPN endpoints. This is because some hefty AES subroutines are built into the hardware (with software drivers doing the rest).
      True. VIA padlock (as they call it) can currently only do AES in hardware (and it can also generate true random numbers). The next VIA chip called C7 (C5J Esther) however should be able to also do SHA-1, SHA-256 and parts of RSA in hardware (I think it should be available first half of 2005). That's of course still a limited set of encryption algorithms, but it's certainly an improvement.
  3. Not worth the outlay at present by cheezemonkhai · · Score: 4, Informative

    Don't get me wrong, it's good that code is optimised, but I think that RC4 would fly faster on an IA64 than an Opteron if specifically optimised to take advantage of the CPU's features.

    RC4 isn't really that relevant in real life as WEP is crap & also easily done in hardware anyway.

    The 64 bit advantage will suffer the same fate as the 32bit advantage did for the 486, Pentium & especially the Pentium Pro.

    486 = 32bits, faster but people still bought 386's due to cost.

    Pentium = 32bits, sometimes faster but again costs meant 486's stayed popular.

    Pentium Pro = 32bit, 16 bit instructions stalled it. When running pure 32bit code it ran like the dogs; when running 16bit code (Win 98) it ran like a dog.

    Problem is that you're generally better off saving your cash, buying a cheap CPU (32bit in this case) and waiting for the 2nd/3rd generation CPU. By that time prices will be more reasonable and you will see the full advantages as programs will use the extra bits properly.

    I mean, come on, MS still hasn't released a final AMD64 version of Winblows yet.

    1. Re:Not worth the outlay at present by joib · · Score: 3, Informative


      486 = 32bits, faster but people still bought 386's due to cost.


      The 386 was also a 32-bit processor...

    2. Re:Not worth the outlay at present by Bert64 · · Score: 4, Informative

      Actually, the majority of SSL websites are using RC4.
      If you use Mozilla and Apache, you can use 256-bit AES encryption for SSL (try loading up PayPal with a Mozilla-based browser), but if either the server or client is Microsoft-based you're stuck with the much weaker 128-bit RC4.
      MS - always behind the curve, no 256-bit encryption, no 64-bit OS

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
  4. Re:until by RupW · · Score: 5, Informative

    Sorry it's not immediately obvious to me. Who are they?

    AFAICR AMD paid SuSE to do the original work. I think the main developers were Jan Hubicka, the current x86-64 maintainer, and Andreas Jaeger. SuSE have a few more well-known GCC contributors: look at MAINTAINERS.

  5. Re:PowerPC G5 by Anonymous Coward · · Score: 4, Informative
    What ouch? You're looking at something different; RC4 is not RC5-72...

    From distributed.net's pages, here's what it has to say on the Opterons for RC5-72 (uniprocessor)
    The Opteron 2420 achieved a score of 9,547,969.00.

    The 2GHz G5 for RC5-72 (uniprocessor) achieved a score of 15,057,412.00 (there are 2.5GHz chips available...) The best multi-cpu scores?
    A 2-way 2 GHz Opteron achieved a score of 15,145,274.67, but
    a 2-way 2.5GHz G5 smoked it with a score of 37,441,192.00.

    Apples to apples, my friend, apples to apples.