Denial of Service via Algorithmic Complexity
dss902 writes "We (Department of Computer Science, Rice University) present a new class of low-bandwidth denial of service attacks that exploit algorithmic deficiencies in many common applications' data structures... Using bandwidth less than a typical dialup modem, we can bring a dedicated Bro server to its knees; after six minutes of carefully chosen packets, our Bro server was dropping as much as 71% of its traffic and consuming all of its CPU. We show how modern universal hashing techniques can yield performance comparable to commonplace hash functions while being provably secure against these attacks."
I hate to ruin it for everyone, but the second link has the same content as the first, only in PDF format.
The project page is http://www.cs.rice.edu/~scrosby/hash/ and actually has details about individual vulnerable applications and test files for several systems. (And is kinder on the server for those who don't want to download the whole paper.)
Denial of Service via Algorithmic Complexity Attacks
Scott A. Crosby Dan S. Wallach
scrosby_at_cs.rice.edu dwallach*cs.rice.edu
Department of Computer Science, Rice University
Abstract:
We present a new class of low-bandwidth denial of service attacks that exploit algorithmic deficiencies in many common applications' data structures. Frequently used data structures have ``average-case'' expected running time that's far more efficient than the worst case. For example, both binary trees and hash tables can degenerate to linked lists with carefully chosen input. We show how an attacker can effectively compute such input, and we demonstrate attacks against the hash table implementations in two versions of Perl, the Squid web proxy, and the Bro intrusion detection system. Using bandwidth less than a typical dialup modem, we can bring a dedicated Bro server to its knees; after six minutes of carefully chosen packets, our Bro server was dropping as much as 71% of its traffic and consuming all of its CPU. We show how modern universal hashing techniques can yield performance comparable to commonplace hash functions while being provably secure against these attacks.
Introduction
When analyzing the running time of algorithms, a common technique is to differentiate best-case, common-case, and worst-cast performance. For example, an unbalanced binary tree will be expected to consume O(n log n) time to insert n elements, but if the elements happen to be sorted beforehand, then the tree would degenerate to a linked list, and it would take O(n^2) time to insert all n elements. Similarly, a hash table would be expected to consume O(n) time to insert n elements. However, if each element hashes to the same bucket, the hash table will also degenerate to a linked list, and it will take O(n^2) time to insert n elements.
While balanced tree algorithms, such as red-black trees [11], AVL trees [1], and treaps [17] can avoid predictable input which causes worst-case behavior, and universal hash functions [5] can be used to make hash functions that are not predictable by an attacker, many common applications use simpler algorithms. If an attacker can control and predict the inputs being used by these algorithms, then the attacker may be able to induce the worst-case execution time, effectively causing a denial-of-service (DoS) attack.
Such algorithmic DoS attacks have much in common with other low-bandwidth DoS attacks, such as stack smashing [2] or the ping-of-death 1, wherein a relatively short message causes an Internet server to crash or misbehave. While a variety of techniques can be used to address these DoS attacks, common industrial practice still allows bugs like these to appear in commercial products. However, unlike stack smashing, attacks that target poorly chosen algorithms can function even against code written in safe languages. One early example was discovered by Garfinkel [10], who described nested HTML tables that induced the browser to perform super-linear work to derive the table's on-screen layout. More recently, Stubblefield and Dean [8] described attacks against SSL servers, where a malicious web client can coerce a web server into performing expensive RSA decryption operations. They suggested the use of crypto puzzles [9] to force clients to perform more work before the server does its work. Provably requiring the client to consume CPU time may make sense for fundamentally expensive operations like RSA decryption, but it seems out of place when the expensive operation (e.g., HTML table layout) is only expensive because a poor algorithm was used in the system. Another recent paper [16] is a toolkit that allows programmers to inject sensors and actuators into a program. When a resource abuse is detected an appropriate action is taken.
This paper focuses on DoS attacks that may be mounted from across a network, targeting servers with the data that they might observe and store in a hash table as part of their normal operation. Section 2 details how hash table
Basically, the paper says this: If you have a hash table into which attackers can insert arbitrary keys, you'd better be using a hash function for which they cannot easily generate collisions.
I don't know if anyone has *published* this before, but it's certainly not new.
Tarsnap: Online backups for the truly paranoid
Since you asked nicely..
This is the HTML version (with lots of images not mirrored, sorry) and this is the PDF version.
If the PDF starts hogging the pipe, don't be surprised if it gets symlinked to the HTML version.
I also remember that problem with Apache. Here's the report and the code change involved for that particular bug.
EOF
Reading the paper which begins with "We present a new class of ..." it sounds like these students discovered a new concept from nowhere.
Maybe their genius has been triggered by the recent advisory about a DOS exploiting hash collisions in netfilter.
They analyzed some softwares but no word about this known vulnerability. Still a good summary but not a discovery.
They exploit the difference between 'typical case' behavior versus worst-case behavior. For instance, in a hash table, the performance is usually O(1) for all operations. However in an adversarial environment, the attacker constructs carefully chosen input such that large number of 'hash collisions' occur. Suitable collisions can be computed even when the attacker is limited to as little as 48 or 32 bits.
These attacks can occur over a very wide gamut of software, with impacts ranging from devestating to innocious.
We have studied and found the following applications possibly vulnerable to a greater or lesser degree:
For the last two, we have a tentative attack file, but have not constructed a test program to confirm the attack.
We have constructed attacks and confirmed the degradation on these:
Also related is the recent linux 2.4.20 route cache attack by Florian Weimer. David Miller is working on a patch that fixes other similar issues in other places of the networking stack.
Our paper discusses a new class of denial of service attacks that work by exploiting the difference between average case performance and worst-case performance. In an adversarial environment, the data structures used by an application may be forced to experience their worst case performance. For instance, hash tables are usually thought of as being constant time operations, but with large numbers of collisions will degrade to a linked list and may lead to a 100-10,000 times performance degradation. Because of the widespread use of hash tables, the potential for attack is extremely widespread. Fortunately, in many cases, other limits on the system limit the impact of these attacks.
To be attackable, an application must have a deterministic or predictable hash function and accept untrusted input. In general, for the attack to be signifigant, the applications must be willing and able to accept hundreds to tens of thousands of 'attack inputs'. Because of that requirement, it is difficult to judge the impact of these attack without knowing the source code extremely well, and knowing all ways in which a program is used.
The solution for these attacks on hash tables is to make the hash function unpredictable via a technique known as universal hashing. Universal hashing is a keyed hash function where, based on the key, one of a large set hash functions is chosen. When benchmarking, we observe that for short or medium length inputs, it is comparable in performance to simple predictable hash functions such as the ones in Python or Perl.
I highly advise using a universal hashing library, either our own or someone elses. As is historically seen, it is very easy to make silly mistakes when attempting to implement your own 'secure' algorithm.
The abstract, paper, and a library implementing universal hashing is available at http://www.cs.rice.edu/~scrosby/hash/.
Scott
** Affected Products: Extremely widespread. Confirmed vulnerable applications include Perl, the Linux kernel, the Bro IDS, and the Squid HTTP proxy cache. Although unconfirmed, vulnerablities appear to be in the GLIB utility library, the OpenBSD kernel, DJBDNS cache, TCL, Python, and Mozilla.
We conjecture that many implementations of hash tables in both closed source and open source software, unless specifically designed to be immune, may be subject to attack. It is likely
http://www.icir.org/vern/bro.html
Any half decent UNIX system (or should I say admin) would have process accounting enabled that would stop this kind of attack.
In case you were wondering what Brois.
There's been extensive discussion of this over on python-dev. Includes some interesting debunking by Python guru Tim Peters, plus another example of the same type of weakness involving backtracking regular expression engines.