Denial of Service via Algorithmic Complexity
dss902 writes "We (Department of Computer Science, Rice University) present a new class of low-bandwidth denial of service attacks that exploit algorithmic deficiencies in many common applications' data structures... Using bandwidth less than a typical dialup modem, we can bring a dedicated Bro server to its knees; after six minutes of carefully chosen packets, our Bro server was dropping as much as 71% of its traffic and consuming all of its CPU. We show how modern universal hashing techniques can yield performance comparable to commonplace hash functions while being provably secure against these attacks."
I hate to ruin it for everyone, but the second link has the same content as the first, only in PDF format.
They claim to present a new method of low-bandwidth denial of service attack, but it looks like they're demonstrating quite an old, high-bandwidth, denial of service.
Tarsnap: Online backups for the truly paranoid
Several years ago Apache had a function that ran in O(n^2) time, and a HTTP GET request was capable of throwing that function into taking a very long time to process, making it easy to DoS an Apache httpd. The interesting part is that the same function could have been done in O(n) time, but wasnt. The Apache team fixed this by substituting the O(n) algorithm.
The One Rule Of Chess You'll Ever Need: Don't play someone who carries a kit in their bookbag.
you can use a modem to post a slashdot article with a link to the target computer...
...But in terms of raw power, nothing can match a slashdoting. Can anyone else read the link?
Speaking of, uh, denial of service...the site's quickly turning into a smoking ruin from where I'm standing. If it had been text only, it might have survived, but all the mathematical symbols are done using images (is that big O or big uh-O ;-) so the server is choking...
Bring on the MathML.
they don't list slahsdotting!
Do you even lift?
These aren't the 'roids you're looking for.
The project page is http://www.cs.rice.edu/~scrosby/hash/ and actually has details about individual vulnerable applications and test files for several systems. (And is kinder on the server for those who don't want to download the whole paper.)
Anyone got mirrors yet?
How am I supposed to fit a pithy, relevant quote into 120 characters?
This doesn't sit well with me. Should students at a University be studying, developing, and releasing improved methods with which to launch DOS attacks..?
It's a Mansierre!
Denial of Service via Algorithmic Complexity Attacks
Scott A. Crosby Dan S. Wallach
scrosby_at_cs.rice.edu dwallach*cs.rice.edu
Department of Computer Science, Rice University
Abstract:
We present a new class of low-bandwidth denial of service attacks that exploit algorithmic deficiencies in many common applications' data structures. Frequently used data structures have ``average-case'' expected running time that's far more efficient than the worst case. For example, both binary trees and hash tables can degenerate to linked lists with carefully chosen input. We show how an attacker can effectively compute such input, and we demonstrate attacks against the hash table implementations in two versions of Perl, the Squid web proxy, and the Bro intrusion detection system. Using bandwidth less than a typical dialup modem, we can bring a dedicated Bro server to its knees; after six minutes of carefully chosen packets, our Bro server was dropping as much as 71% of its traffic and consuming all of its CPU. We show how modern universal hashing techniques can yield performance comparable to commonplace hash functions while being provably secure against these attacks.
Introduction
When analyzing the running time of algorithms, a common technique is to differentiate best-case, common-case, and worst-cast performance. For example, an unbalanced binary tree will be expected to consume O(n log n) time to insert n elements, but if the elements happen to be sorted beforehand, then the tree would degenerate to a linked list, and it would take O(n^2) time to insert all n elements. Similarly, a hash table would be expected to consume O(n) time to insert n elements. However, if each element hashes to the same bucket, the hash table will also degenerate to a linked list, and it will take O(n^2) time to insert n elements.
While balanced tree algorithms, such as red-black trees [11], AVL trees [1], and treaps [17] can avoid predictable input which causes worst-case behavior, and universal hash functions [5] can be used to make hash functions that are not predictable by an attacker, many common applications use simpler algorithms. If an attacker can control and predict the inputs being used by these algorithms, then the attacker may be able to induce the worst-case execution time, effectively causing a denial-of-service (DoS) attack.
Such algorithmic DoS attacks have much in common with other low-bandwidth DoS attacks, such as stack smashing [2] or the ping-of-death 1, wherein a relatively short message causes an Internet server to crash or misbehave. While a variety of techniques can be used to address these DoS attacks, common industrial practice still allows bugs like these to appear in commercial products. However, unlike stack smashing, attacks that target poorly chosen algorithms can function even against code written in safe languages. One early example was discovered by Garfinkel [10], who described nested HTML tables that induced the browser to perform super-linear work to derive the table's on-screen layout. More recently, Stubblefield and Dean [8] described attacks against SSL servers, where a malicious web client can coerce a web server into performing expensive RSA decryption operations. They suggested the use of crypto puzzles [9] to force clients to perform more work before the server does its work. Provably requiring the client to consume CPU time may make sense for fundamentally expensive operations like RSA decryption, but it seems out of place when the expensive operation (e.g., HTML table layout) is only expensive because a poor algorithm was used in the system. Another recent paper [16] is a toolkit that allows programmers to inject sensors and actuators into a program. When a resource abuse is detected an appropriate action is taken.
This paper focuses on DoS attacks that may be mounted from across a network, targeting servers with the data that they might observe and store in a hash table as part of their normal operation. Section 2 details how hash table
Basically, the paper says this: If you have a hash table into which attackers can insert arbitrary keys, you'd better be using a hash function for which they cannot easily generate collisions.
I don't know if anyone has *published* this before, but it's certainly not new.
Tarsnap: Online backups for the truly paranoid
I agree that DoS attacks are really lame, but at least this is fairly complex and has some interesting technological implications. Most DoS attacks are just floods of packets, and can't really be defended against. This attack can be prevented, and maybe will help improve security in other ways. I'd rather see 10 people who can pull off a preventable algorithmic complexity attack than 10,000 who just packet a server to death. People will write tools, but they'll still remain too hard for moron DDoSer script kiddies to pull off.
Since you asked nicely..
This is the HTML version (with lots of images not mirrored, sorry) and this is the PDF version.
If the PDF starts hogging the pipe, don't be surprised if it gets symlinked to the HTML version.
...we can bring a dedicated _Bro_ server to its knees...
Why they always trin' to bring the black man down?
Karma: The shiznight, mostly because I am the Drizzle.
I understand your question, but it, IMO, is like asking if "the people" should be free to speak about the bad things their King does, and wouldn't that just be bad for the Kingdom?
The only 'bro' server I can think of is buddyhead.com ... Not much of an attack if you ask me ...
Reading the paper which begins with "We present a new class of ..." it sounds like these students discovered a new concept from nowhere.
Maybe their genius has been triggered by the recent advisory about a DOS exploiting hash collisions in netfilter.
They analyzed some softwares but no word about this known vulnerability. Still a good summary but not a discovery.
They exploit the difference between 'typical case' behavior versus worst-case behavior. For instance, in a hash table, the performance is usually O(1) for all operations. However in an adversarial environment, the attacker constructs carefully chosen input such that large number of 'hash collisions' occur. Suitable collisions can be computed even when the attacker is limited to as little as 48 or 32 bits.
These attacks can occur over a very wide gamut of software, with impacts ranging from devestating to innocious.
We have studied and found the following applications possibly vulnerable to a greater or lesser degree:
For the last two, we have a tentative attack file, but have not constructed a test program to confirm the attack.
We have constructed attacks and confirmed the degradation on these:
Also related is the recent linux 2.4.20 route cache attack by Florian Weimer. David Miller is working on a patch that fixes other similar issues in other places of the networking stack.
Our paper discusses a new class of denial of service attacks that work by exploiting the difference between average case performance and worst-case performance. In an adversarial environment, the data structures used by an application may be forced to experience their worst case performance. For instance, hash tables are usually thought of as being constant time operations, but with large numbers of collisions will degrade to a linked list and may lead to a 100-10,000 times performance degradation. Because of the widespread use of hash tables, the potential for attack is extremely widespread. Fortunately, in many cases, other limits on the system limit the impact of these attacks.
To be attackable, an application must have a deterministic or predictable hash function and accept untrusted input. In general, for the attack to be signifigant, the applications must be willing and able to accept hundreds to tens of thousands of 'attack inputs'. Because of that requirement, it is difficult to judge the impact of these attack without knowing the source code extremely well, and knowing all ways in which a program is used.
The solution for these attacks on hash tables is to make the hash function unpredictable via a technique known as universal hashing. Universal hashing is a keyed hash function where, based on the key, one of a large set hash functions is chosen. When benchmarking, we observe that for short or medium length inputs, it is comparable in performance to simple predictable hash functions such as the ones in Python or Perl.
I highly advise using a universal hashing library, either our own or someone elses. As is historically seen, it is very easy to make silly mistakes when attempting to implement your own 'secure' algorithm.
The abstract, paper, and a library implementing universal hashing is available at http://www.cs.rice.edu/~scrosby/hash/.
Scott
** Affected Products: Extremely widespread. Confirmed vulnerable applications include Perl, the Linux kernel, the Bro IDS, and the Squid HTTP proxy cache. Although unconfirmed, vulnerablities appear to be in the GLIB utility library, the OpenBSD kernel, DJBDNS cache, TCL, Python, and Mozilla.
We conjecture that many implementations of hash tables in both closed source and open source software, unless specifically designed to be immune, may be subject to attack. It is likely
Wasn't that some command-line prompt game in 1983 somewhere?
This space for rent.
http://www.icir.org/vern/bro.html
I skimmed the Project Page and aren't a couple of the examples awefully obvious?
The following one line of code brings every UNIX system I've run it on TO ITS KNEES WITHIN MINUTES!! This is a major vulnerability in EVERY UNIX system! Something must be done!
should not be considered a Denial of Service. If anything, call it what it truely is: shoddy programming. There is nothing being denied when it is the client that has the problem.
Another thing.. this is nothing new. A number of DoS attacks require the server to do more than the client requests. The attacker's complexity would be O(n) whereas the server would be O(n^2) or some such, where n is any given method of communication. The only "new" thing would perhaps be looking at individual programs and algorithms, which is probably applying a very liberal definition of "new." They even admit it themselves by claiming ping-of-death and stack smashing have much in common with what they describe in the article.
Dijkstra Considered Dead
Well, back in 1990 when I started in university computer science, this *was* one of the main things that a denial of service attack was considered to be.
DoS attacks were mainly either removal of a service (by crashing it and/or stoping it's reload) or resource starvation, being any of CPU, disk, memory, network, etc.
Good to see these people have bothered to flick through a bit of history probably over 15 years old by now and call it something new, yawn.
Having had enough background in large computer systems, it can become quite depressing watching the constant flow of 'new innovations' that have been done may many years before on big iron, but are seen as new by people who just don't have the experience to realise.
Of course it is easier to DoS a machine by using it's own functionality against it rather than just brute force, welcome to computer science 101.
Just wait six months and someone will be releasing a fantastic new defense against this by limiting the CPU resources of given tasks to defined amounts so they cannot stop the system and only that particular service.
I mean come on, this was a common undergrad trick on the vaxes we used to get to play with way back then.
In case you were wondering what Brois.
My brain is resistant to attacks using algorithmic complexity
There's been extensive discussion of this over on python-dev. Includes some interesting debunking by Python guru Tim Peters, plus another example of the same type of weakness involving backtracking regular expression engines.
But consider a commercial app where customers can send requests to a J2EE app server running within a JVM. That's a very popular, very common setup (JBoss, BEA Weblogic, IBM WebSphere, etc.). The JVM is a single process. It is not CPU-capped because it's designed to stay up and running. When a Java thread handles a request and bumps into a CPU-hogging attack, it is not going to be terminated by the J2EE app server.
So this is potentially a problem, because you currently do not have a CPU-capping parameter in the most popular J2EE app servers. A response to this kind of attacks would require monitoring the amount of CPU consumed by threads processing incoming requests, which is always delicate.
CPU-capping shouldn't be done lightly. It can lead to disastrous failures. For instance, I once tried to use a graphical web application rendering some do-it-yourself tee-shirt lettering. The application was running on an older IIS and apparently had a CPU-time cap, because I got a message "sorry, your request took too long to process" when my design became a bit involved. Needless to say, my business went to a competitor. So CPU-capping isn't even a sure-fire solution.
In summary: Sorry, it is an issue.
--
Mad science! Robots! Underwear! Cute girls! Full comic online! http://www.girlgeniusonline.com/
I found your paper very interesting. I'd like to address a couple things.
1. You point out that MD5 is vulnerable to a brute-force search for bucket collisions. Isn't any deterministic hash-function vulnerable to the same attack? I know you solved the problem using a keyed version of MD5. With Carter and Wegman, you alluded to a randomly chosen constant and vector. I didn't notice you addressing this issue with UMAC.
2. The abstract says, "We show how modern universal hashing techniques can yield performance comparable to commonplace hash functions while being provably secure against these attacks." The abstract is ambiguous, but it insinuates that you'll provide a proof. I didn't see one. Perhaps it was in the references? Even if it was, certainly your 20 bit Carter-Wegman construction merits a new proof.
3. You said, "When analyzing algorithmic complexity attacks, we must assume the attacker has access to the source code of the application," I disagree. The attacker doesn't need the source code. They can reverse engineer the compiled binary.
Also, I wonder if an attacker even needs the program. An attacker could reasonably guess that a Perl script will store a certain string in an associative array. Many websites automatically process their Apache logs with Perl. Instead of requesting real pages, a blackhat could request strings that will hash to the same bucket (assuming the site uses Perl). When cron starts processing the logs, the website could slow to a crawl. Granted, the attacker knows the hash function from Perl, but they don't have access to the website's custom-made script.
4. You have a spelling error. Your paper should read, "due to the ten times larger queues of unprocessed events".
>It's scary to me to think that people who write code in the first style get to submit code to important projects like Apache. It looks like a Basic programmer writing in C syntax. Even for a string with no double slashes it is far less efficient than the second style.
: ...
:
Actually this is one of the fundamental methodologies used in software development, dating back to the early Slow-Machineazoic period
In theory it goes like this
1. On paper, design the flow of data, what is hoped to be accomplished.
2. Prototype the functionality as a proof of concept, insure that the tools you are using to program the solution are sufficient to do it. Usually this code is pretty gnarly and uses O(n^2) or worse routines (cut and paste in a bubblesort, anybody?), just quick and dirty, screw memory leaks, etc
3. Test the functionality.
4. Using the prototype as a blueprint, rewrite the project using your most awesome coding techniques (replace the bubblesort with a quicksort, etc.)
5. Test the release candidate.
6. Ship it!
The (yes, it is documented) problem with this methodology is that it generally devolves to this
1. On paper, design the flow of data, what is hoped to be accomplished.
2. Prototype the functionality as a proof of concept, using routines that suck and are slow and leak memory like a fish net.
3. Test the functionality.
4. Somebody watching the calendar sees you are already over schedule.
5. Some bean counter over-rules your demands to rewrite the entire package, given that he watched it work.
6. Ship it!
Glonoinha the MebiByte Slayer