Slashdot Mirror


Denial of Service via Algorithmic Complexity

dss902 writes "We (Department of Computer Science, Rice University) present a new class of low-bandwidth denial of service attacks that exploit algorithmic deficiencies in many common applications' data structures... Using bandwidth less than a typical dialup modem, we can bring a dedicated Bro server to its knees; after six minutes of carefully chosen packets, our Bro server was dropping as much as 71% of its traffic and consuming all of its CPU. We show how modern universal hashing techniques can yield performance comparable to commonplace hash functions while being provably secure against these attacks."

17 of 257 comments (clear)

  1. Same Content / Two Links by Zach+Garner · · Score: 5, Informative

    I hate to ruin it for everyone, but the second link has the same content as the first, only in PDF format.

  2. The project page (the one with the details) by Convergence · · Score: 5, Informative

    The project page is http://www.cs.rice.edu/~scrosby/hash/ and actually has details about individual vulnerable applications and test files for several systems. (And is kinder on the server for those who don't want to download the whole paper.)

  3. Site is DoSed -- Abstract & Intro by Anonymous Coward · · Score: 4, Informative

    Denial of Service via Algorithmic Complexity Attacks

    Scott A. Crosby Dan S. Wallach
    scrosby_at_cs.rice.edu dwallach*cs.rice.edu

    Department of Computer Science, Rice University

    Abstract:
    We present a new class of low-bandwidth denial of service attacks that exploit algorithmic deficiencies in many common applications' data structures. Frequently used data structures have ``average-case'' expected running time that's far more efficient than the worst case. For example, both binary trees and hash tables can degenerate to linked lists with carefully chosen input. We show how an attacker can effectively compute such input, and we demonstrate attacks against the hash table implementations in two versions of Perl, the Squid web proxy, and the Bro intrusion detection system. Using bandwidth less than a typical dialup modem, we can bring a dedicated Bro server to its knees; after six minutes of carefully chosen packets, our Bro server was dropping as much as 71% of its traffic and consuming all of its CPU. We show how modern universal hashing techniques can yield performance comparable to commonplace hash functions while being provably secure against these attacks.

    Introduction
    When analyzing the running time of algorithms, a common technique is to differentiate best-case, common-case, and worst-cast performance. For example, an unbalanced binary tree will be expected to consume O(n log n) time to insert n elements, but if the elements happen to be sorted beforehand, then the tree would degenerate to a linked list, and it would take O(n^2) time to insert all n elements. Similarly, a hash table would be expected to consume O(n) time to insert n elements. However, if each element hashes to the same bucket, the hash table will also degenerate to a linked list, and it will take O(n^2) time to insert n elements.

    While balanced tree algorithms, such as red-black trees [11], AVL trees [1], and treaps [17] can avoid predictable input which causes worst-case behavior, and universal hash functions [5] can be used to make hash functions that are not predictable by an attacker, many common applications use simpler algorithms. If an attacker can control and predict the inputs being used by these algorithms, then the attacker may be able to induce the worst-case execution time, effectively causing a denial-of-service (DoS) attack.

    Such algorithmic DoS attacks have much in common with other low-bandwidth DoS attacks, such as stack smashing [2] or the ping-of-death 1, wherein a relatively short message causes an Internet server to crash or misbehave. While a variety of techniques can be used to address these DoS attacks, common industrial practice still allows bugs like these to appear in commercial products. However, unlike stack smashing, attacks that target poorly chosen algorithms can function even against code written in safe languages. One early example was discovered by Garfinkel [10], who described nested HTML tables that induced the browser to perform super-linear work to derive the table's on-screen layout. More recently, Stubblefield and Dean [8] described attacks against SSL servers, where a malicious web client can coerce a web server into performing expensive RSA decryption operations. They suggested the use of crypto puzzles [9] to force clients to perform more work before the server does its work. Provably requiring the client to consume CPU time may make sense for fundamentally expensive operations like RSA decryption, but it seems out of place when the expensive operation (e.g., HTML table layout) is only expensive because a poor algorithm was used in the system. Another recent paper [16] is a toolkit that allows programmers to inject sensors and actuators into a program. When a resource abuse is detected an appropriate action is taken.

    This paper focuses on DoS attacks that may be mounted from across a network, targeting servers with the data that they might observe and store in a hash table as part of their normal operation. Section 2 details how hash table

  4. I hope this isn't news to anyone... by cperciva · · Score: 5, Informative

    Basically, the paper says this: If you have a hash table into which attackers can insert arbitrary keys, you'd better be using a hash function for which they cannot easily generate collisions.

    I don't know if anyone has *published* this before, but it's certainly not new.

    1. Re:I hope this isn't news to anyone... by miu · · Score: 2, Informative
      Does this still work on hashtables where each node consists of a linked list? (as opposed to ones where if the node is full, it does on to check the next node)

      Yep, both chained and open-addressed hash can degenerate to a linked list. A chained hash (what you describe as a node consisting of a linked list) is actually a much more common hash type than open-addressing (where the values are stored in the table). There are a couple reasons, a chained hash can overflow, and an open ended hash always has to check the value, not just during collision resolution (many implementations of chained hashes do the same).

      --

      [Set Cain on fire and steal his lute.]
    2. Re:I hope this isn't news to anyone... by cperciva · · Score: 2, Informative

      I'm interested by what the authors say about 'modern universal hash functions' and how they are immune from attack. Does this mean you pick a random string and XOR against it before computing each hash value, or something like that? It would seem there could still be some way to discover the exact hash function being used by means of timing attacks (you can tell when an input has caused a hash collision because the response time might be slightly slower).

      If you use a strong keyed hash function (eg, keyed MD5) then there is no known method to recover the hash key with a chosen plaintext attack. In other words, even if you know exactly what each of your inputs hashes to, you still won't be able to predict the hash of anything else (or, consequently, create collisions).

      Incidentally, hash functions are only O(1) average case under bogus assumptions. Random access to an n-bit address space takes O(log n) time.

    3. Re:I hope this isn't news to anyone... by cperciva · · Score: 2, Informative

      And any 2nd year Computer Science student who stayed awake during his lectures knows that, given a source of random bits, the probability of quicksort taking more than O(n log n) time is zero for any input.

  5. Mirror mirror on the wall by Adam9 · · Score: 5, Informative

    Since you asked nicely..

    This is the HTML version (with lots of images not mirrored, sorry) and this is the PDF version.

    If the PDF starts hogging the pipe, don't be surprised if it gets symlinked to the HTML version.

  6. Re:I've seen this before by weezel · · Score: 5, Informative

    I also remember that problem with Apache. Here's the report and the code change involved for that particular bug.

    --
    EOF
  7. Credits by alain1234 · · Score: 5, Informative

    Reading the paper which begins with "We present a new class of ..." it sounds like these students discovered a new concept from nowhere.
    Maybe their genius has been triggered by the recent advisory about a DOS exploiting hash collisions in netfilter.
    They analyzed some softwares but no word about this known vulnerability. Still a good summary but not a discovery.

    1. Re:Credits by Anonymous Coward · · Score: 5, Informative

      We cite Florian's attack in the paper, predate it by several months. And, had you read the actual advisory, you'd already have seen our paper, we're on his highly reccomended reading list on his advisory for the last 2 weeks.

      Scott

    2. Re:Credits by The+Madpostal+Worker · · Score: 5, Informative
      If you scroll down to the bottom of the advisory you'll see a link to orginal advisory. Here's the begining of the message:
      List: linux-kernel
      Subject: Route cache performance under stress
      From: Florian Weimer <fw () deneb ! enyo ! de>
      Date: 2003-04-05 16:37:43
      Please read the following paper:

      <http://www.cs.rice.edu/~scrosby/tr/HashAttack.pdf >

      Then look at the 2.4 route cache implementation.

      Short summary: It is possible to freeze machines with 1 GB of RAM and more with a stream of 400 packets per second with carefully chosen source addresses. Not good.
      It seems the advisory stems from the paper, not the other way around.
      --

      /*
      *Not a Sermon, Just a Thought
      */
  8. Bugtraq Post by Anonymous Coward · · Score: 5, Informative
    Hello. This is to announce a new class of attack which we have named 'Algorithmic Complexity Attack'. These attacks can perform denial of service and/or cause the victim to consume more CPU time than expected. We have a website for our research paper and project and tentative source code illustrating the solution, universal hashing, available at http://www.cs.rice.edu/~scrosby/hash/

    They exploit the difference between 'typical case' behavior versus worst-case behavior. For instance, in a hash table, the performance is usually O(1) for all operations. However in an adversarial environment, the attacker constructs carefully chosen input such that large number of 'hash collisions' occur. Suitable collisions can be computed even when the attacker is limited to as little as 48 or 32 bits.

    These attacks can occur over a very wide gamut of software, with impacts ranging from devestating to innocious.

    We have studied and found the following applications possibly vulnerable to a greater or lesser degree:
    • Mozilla 1.3.1
    • DJBDNS 1.05
    • TCL 8.4.3
    • GLIB 2.2.1
    • Python 2.3b1

    For the last two, we have a tentative attack file, but have not constructed a test program to confirm the attack.

    We have constructed attacks and confirmed the degradation on these:

    • Perl 5.6.1
    • Perl 5.8.0
    • Linux 2.4.20 directory cache (dcache)
    • Squid 2.5STABLE1
    • Bro IDS 0.8a20

    Also related is the recent linux 2.4.20 route cache attack by Florian Weimer. David Miller is working on a patch that fixes other similar issues in other places of the networking stack.

    Our paper discusses a new class of denial of service attacks that work by exploiting the difference between average case performance and worst-case performance. In an adversarial environment, the data structures used by an application may be forced to experience their worst case performance. For instance, hash tables are usually thought of as being constant time operations, but with large numbers of collisions will degrade to a linked list and may lead to a 100-10,000 times performance degradation. Because of the widespread use of hash tables, the potential for attack is extremely widespread. Fortunately, in many cases, other limits on the system limit the impact of these attacks.

    To be attackable, an application must have a deterministic or predictable hash function and accept untrusted input. In general, for the attack to be signifigant, the applications must be willing and able to accept hundreds to tens of thousands of 'attack inputs'. Because of that requirement, it is difficult to judge the impact of these attack without knowing the source code extremely well, and knowing all ways in which a program is used.

    The solution for these attacks on hash tables is to make the hash function unpredictable via a technique known as universal hashing. Universal hashing is a keyed hash function where, based on the key, one of a large set hash functions is chosen. When benchmarking, we observe that for short or medium length inputs, it is comparable in performance to simple predictable hash functions such as the ones in Python or Perl.

    I highly advise using a universal hashing library, either our own or someone elses. As is historically seen, it is very easy to make silly mistakes when attempting to implement your own 'secure' algorithm.

    The abstract, paper, and a library implementing universal hashing is available at http://www.cs.rice.edu/~scrosby/hash/.

    Scott

    ** Affected Products: Extremely widespread. Confirmed vulnerable applications include Perl, the Linux kernel, the Bro IDS, and the Squid HTTP proxy cache. Although unconfirmed, vulnerablities appear to be in the GLIB utility library, the OpenBSD kernel, DJBDNS cache, TCL, Python, and Mozilla.

    We conjecture that many implementations of hash tables in both closed source and open source software, unless specifically designed to be immune, may be subject to attack. It is likely

  9. Re:What is a "Bro server"? by Anonymous Coward · · Score: 3, Informative
  10. Re:glib example by msgmonkey · · Score: 4, Informative

    Any half decent UNIX system (or should I say admin) would have process accounting enabled that would stop this kind of attack.

  11. Bro IDS info by Fzz · · Score: 2, Informative

    In case you were wondering what Brois.

  12. Discussion on python-dev by thornist · · Score: 4, Informative

    There's been extensive discussion of this over on python-dev. Includes some interesting debunking by Python guru Tim Peters, plus another example of the same type of weakness involving backtracking regular expression engines.