Slashdot Mirror


RC4 Code Achieves 319 MB/s On AMD64 Opteron

Marc Bevand writes "This recent paper is about optimizing RC4 for AMD64 processors. A working implementation is provided. Its encryption/decryption throughput reaches 319 MB/s on a single AMD Opteron x44 processor running at 1.8 GHz. This makes it, as of today, the world's fastest RC4 symmetric cipher implementation for general purpose CPUs. As the author of this work, I would like to point out that many CPU-hungry applications have not been optimized for AMD64 yet. In other words: such speedups can be expected in other areas." An anonymous reader adds some figures for the old implementation: "Opteron 244 1.8 GHz (32-bit) 163 MB/s; Opteron 244 1.8 GHz (64-bit) 135 MB/s."
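
For readers unfamiliar with the cipher, here is a minimal, unoptimized sketch of RC4 in portable C. It is not the paper's AMD64 implementation, and the names `rc4_state`, `rc4_init` and `rc4_crypt` are made up for this illustration. It only shows the byte-at-a-time key schedule and keystream loop that a tuned version would have to speed up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cipher state: a 256-byte permutation plus two byte-sized indices.
 * (Hypothetical type name, not taken from the paper.) */
typedef struct {
    uint8_t s[256];
    uint8_t i, j;
} rc4_state;

/* Key schedule: start from the identity permutation and shuffle it
 * under control of the key bytes. */
void rc4_init(rc4_state *st, const uint8_t *key, size_t key_len)
{
    assert(key_len > 0 && key_len <= 256);
    for (int n = 0; n < 256; n++)
        st->s[n] = (uint8_t)n;

    uint8_t j = 0;
    for (int n = 0; n < 256; n++) {
        j = (uint8_t)(j + st->s[n] + key[(size_t)n % key_len]);
        uint8_t tmp = st->s[n];
        st->s[n] = st->s[j];
        st->s[j] = tmp;
    }
    st->i = 0;
    st->j = 0;
}

/* Keystream generation: produce one keystream byte per input byte
 * and XOR it into buf in place. Encryption and decryption are the
 * same operation. */
void rc4_crypt(rc4_state *st, uint8_t *buf, size_t len)
{
    uint8_t i = st->i, j = st->j;
    for (size_t n = 0; n < len; n++) {
        i = (uint8_t)(i + 1);
        j = (uint8_t)(j + st->s[i]);
        uint8_t tmp = st->s[i];
        st->s[i] = st->s[j];
        st->s[j] = tmp;
        buf[n] ^= st->s[(uint8_t)(st->s[i] + st->s[j])];
    }
    st->i = i;
    st->j = j;
}
```

Because RC4 is a stream cipher, running `rc4_crypt` over the ciphertext with a freshly initialized state and the same key recovers the plaintext.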

9 of 177 comments

  1. until by iamnotacrook · · Score: 4, Insightful

    AMD decides to provide a compiler for its chip, optimization will always be behind Intel (who do provide one, for Linux as well).

    1. Re:until by isometrick · · Score: 4, Insightful

      I agree, to an extent. It's been said that Intel's compiler can outdo GCC in some performance benchmarks.

      GCC is no slouch though, and obviously Intel is performing some tricks that could also be implemented by GCC.

      I think it'd be a great move for AMD to work WITH GNU to optimize 64-bit AMD code from GCC.

      Seems like Intel is more prone to keeping secrets when it comes to processors. Maybe this is (yet another) way for AMD to give them a run for their money.

  2. well... by mx.2000 · · Score: 4, Insightful

    "I would like to point out that many CPU-hungry applications have not been optimized for AMD64 yet. In other words: such speedups can be expected in other areas."

    Well, maybe in some areas.
    Since this is a cipher, it obviously helps a lot when you can work on 64-bit chunks of data instead of 32-bit ones.

    The same speedup can probably be seen with applications that use numbers larger than 32 bits (or 64 bits for floats), since the number of operations necessary will essentially halve.

    But other than that, I don't see much room for huge speedups.
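
The point about wide numbers can be sketched with multi-word addition. The helpers below (`add_limbs32` and `add_limbs64`, hypothetical names) add two little-endian big integers stored as arrays of machine words. A 128-bit value needs four 32-bit limbs but only two 64-bit limbs, so the loop runs half as many times. This illustrates the operation count only; it is not a benchmark and says nothing about RC4 itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Add two n-limb integers made of 32-bit limbs (least significant
 * limb first). Returns the carry out of the top limb. */
uint32_t add_limbs32(uint32_t *out, const uint32_t *a,
                     const uint32_t *b, size_t n)
{
    uint32_t carry = 0;
    for (size_t k = 0; k < n; k++) {
        /* A 64-bit temporary holds the 33-bit intermediate sum. */
        uint64_t sum = (uint64_t)a[k] + b[k] + carry;
        out[k] = (uint32_t)sum;
        carry = (uint32_t)(sum >> 32);
    }
    return carry;
}

/* Same operation with 64-bit limbs: half as many iterations for the
 * same total width. Carries are detected by unsigned wrap-around. */
uint64_t add_limbs64(uint64_t *out, const uint64_t *a,
                     const uint64_t *b, size_t n)
{
    uint64_t carry = 0;
    for (size_t k = 0; k < n; k++) {
        uint64_t sum = a[k] + carry;
        uint64_t c1 = sum < carry;   /* wrapped while adding the carry */
        sum += b[k];
        uint64_t c2 = sum < b[k];    /* wrapped while adding b[k] */
        out[k] = sum;
        carry = c1 | c2;             /* at most one of the two can be set */
    }
    return carry;
}
```

For a 128-bit operand, `add_limbs32` would be called with `n = 4` and `add_limbs64` with `n = 2`.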

  3. Optimization First, Features Second by Space_Soldier · · Score: 4, Insightful

    I wish that every software company would put optimization first and features second. That way, we would not have to buy new computers every few years. They could potentially last much longer.

    1. Re:Optimization First, Features Second by Smoo_Master · · Score: 2, Insightful

      I would tend to disagree with that. While one should weigh the performance against an overabundance of features, overzealous optimization can also result in problems. Remember that Knuth said "Premature optimization is the root of all evil."

      If someone took your idea to the extreme, you might get something like this:
      "What does it do?"
      "Nothing, but look how *fast* it does it?"

      I think the best solution is moderation at both ends.

    2. Re:Optimization First, Features Second by shplorb · · Score: 2, Insightful

      Buy a games console then =]

  4. Re:PowerPC G5 by fizze · · Score: 2, Insightful

    I don't know why everyone jumps off the horse as soon as they hear the magic word "assembly".
    Seriously.
    If you want to get 110% out of your hardware, you have to put effort in to get effort out. Makes sense, doesn't it?

    I'm not saying people who don't like ASM are sissies, not at all. But I'm saying that assembly has its place, just like so many other programming languages.

    --
    Powerful is he who overpowers his temptations.

  5. Re:PowerPC G5 by TiMac · · Score: 2, Insightful

    Indeed.

    But when other projects beckon that don't require assembler work, I'm not about to jump on one that does for "fun" either ;)

  6. Re:Not worth the outlay at present by Anonymous Coward · · Score: 2, Insightful

    "I'm not sure why you think that IA-64 would outperform AMD64. For those who don't know, IA-64 refers to Intel's VLIW instruction set that is used with the Itanium. RC4 generally is an integer type application, which the Opteron usually does better in (according to the SPEC results)."

    Itanium does really well on encryption in general. Hand-optimized code makes good use of the large register set, the modulo-scheduling of loops and powerful bit manipulation primitives.

    IIRC Itanium held the top spot in SpecSSL for a while (I don't know where it stands currently; I don't think the numbers are current).

    "In fact, the only time Itanium does well in anything is when it has 6MB of L2 cache"

    Stop drinking the AMD Kool-Aid.