Microsoft Fuzzing Botnet Finds 1,800 Office Bugs

Posted by timothy on Thursday April 1, 2010 @09:09PM from the running-through-the-possibilities dept.

CWmike writes "Microsoft uncovered more than 1,800 bugs in Office 2010 by tapping into the unused computing horsepower of idling PCs, a company security engineer said on Wednesday. Office developers found the bugs by running millions of 'fuzzing' tests, a practice employed by both software developers and security researchers, that searches for flaws by inserting data into file format parsers to see where programs fail by crashing. 'We found and fixed about 1,800 bugs in Office 2010's code,' said Tom Gallagher, senior security test lead with Microsoft's Trustworthy Computing group, who last week co-hosted a presentation on Microsoft's fuzzing efforts at the CanSecWest security conference. 'While a large number, it's important to note that that doesn't mean we found 1,800 security issues. We also want to fix things that are not security concerns.'"

41 of 111 comments

xkydgtufhlofhil by Anonymous Coward · 2010-04-01 21:13 · Score: 5, Funny

ghulkgiplgbvihlnk luioguilgil.bjohj110-o; Huto;bn
1. Re:xkydgtufhlofhil by troll8901 · 2010-04-01 22:35 · Score: 3, Interesting
  
  ghulkgiplgbvihlnk luioguilgil.bjohj110-o; Huto;bn
  I don't understand this Score:4 Insightful comment. Can someone explain?
2. Re:xkydgtufhlofhil by sucker_muts · 2010-04-01 22:40 · Score: 5, Informative
  
  I don't understand this Score:4 Insightful comment. Can someone explain?
  Even though your name does look quite suspicious, I'll try to explain anyway.
  
  The parent is showing how fuzzing works:
  Using random 'data' to test the various functions of software, so we can find out if a certain piece of input triggers undesirable behavior.
  
  In this case you could say that he's not only giving an example, but is testing the slashdot user comments code as well.
  
  But it's perhaps more an attempt at humor. :-)
  
  --
  Dependency hell? => /bin/there/done/that
3. Re:xkydgtufhlofhil by msclrhd · 2010-04-01 22:41 · Score: 2, Informative
  
  Fuzzing is a technique where you modify the data sent to a file, protocol or data parser (e.g. code that reads an XML file) by changing random bits. Thus, if you have a 'text' command, a fuzzer could change that to 'next', or if you have a quoted string "test", the fuzzer could change the end quote to something else, e.g. ' "tests '.
  Hence, what you can end up with is something that looks like random garbage.
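
  A minimal sketch of the bit-flipping idea described above, in Python. The name `flip_random_bits` and its parameters are made up for illustration; real fuzzers layer many more mutation strategies on top of this.

```python
import random


def flip_random_bits(data: bytes, flips: int, rng: random.Random) -> bytes:
    """Return a copy of data with up to `flips` bits inverted.

    A bit that happens to be picked twice flips back, hence "up to".
    """
    if not data:
        return data
    mutated = bytearray(data)
    for _ in range(flips):
        position = rng.randrange(len(mutated) * 8)  # any bit in the buffer
        mutated[position // 8] ^= 1 << (position % 8)
    return bytes(mutated)


if __name__ == "__main__":
    rng = random.Random(1234)  # fixed seed so a failing case can be replayed
    sample = b'<text value="test"/>'
    for _ in range(3):
        print(flip_random_bits(sample, flips=2, rng=rng))
```

  Passing in a seeded `random.Random` rather than using the global generator is a deliberate choice here: when a mutated input crashes the parser, the seed is enough to regenerate it.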
4. Re:xkydgtufhlofhil by Anonymous Coward · 2010-04-02 00:03 · Score: 2, Funny
  
  Windows: "It's not a bug, it's a feature."
  GNOME: "It's not a bug, it's a design decision."
5. Re:xkydgtufhlofhil by jonadab · 2010-04-02 01:01 · Score: 5, Informative
  
  Except that, in most cases, random letters in the ranges a-z and A-Z are not where you're going to find most of your problems. The major sources of bugs that can be uncovered by random data are assumptions that programmers (sometimes subconsciously) make about what the data are going to be like.
  
  The most obvious of these are assumptions like "a newline can't occur in a single-line field" (a mistake web developers often make, because they assume the data are coming from an HTML input element that only allows single-line data; but an attacker can in fact send anything they want in an HTTP request), or "nobody's going to have a single-quote character in their name" (hello, SQL injection attack). This sort of thing is probably not a major factor in Office, because it's common for documents to have those kinds of characters in them. There might be a couple of weird old control characters (like the ASCII NUL, 000), but those bugs were probably found aeons ago.
  
  A second major category of problematic assumptions has to do with languages and code pages and character sets. When software that was written to assume a particular character set (like ASCII, or Latin-1) or even just one code page at a time (like, whichever one is the system default) has to be extended to support more (like, especially, Unicode), you run into all kinds of nasties. Again, though, Office probably had to deal with these issues a couple of versions ago. They may have found a few more, but at this point it's probably not the most fertile ground any more.
  
  When you're dealing with file formats, however, there are also things like "the value at offset 0x003C from the beginning of the object header contains the size of the object, which can never be more than 0xFFFF" and "an object can embed another object by referencing it, but there are never any circular references, because the software doesn't allow the user to put an object inside itself". These sorts of assumptions pop up every time you write or change code that reads a file format, so they never go away really. This sort of thing is probably *most* of what the Office team found, I suspect.
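
  To make those two assumptions concrete, here is a rough Python sketch of a reader that checks them instead of trusting them. The 32-bit little-endian size field, the `embeds` mapping and both function names are invented for the example; they do not describe any real Office structure.

```python
import struct

MAX_OBJECT_SIZE = 0xFFFF  # hypothetical limit, echoing the example above


def read_object_size(header: bytes, offset: int = 0x3C) -> int:
    """Read a little-endian 32-bit size field and refuse implausible values."""
    if len(header) < offset + 4:
        raise ValueError("header too short to contain a size field")
    (size,) = struct.unpack_from("<I", header, offset)
    if size > MAX_OBJECT_SIZE:
        raise ValueError(f"declared size {size:#x} exceeds {MAX_OBJECT_SIZE:#x}")
    return size


def check_no_cycles(embeds: dict[int, list[int]], root: int) -> None:
    """Raise ValueError if following embed references from root ever loops."""
    in_progress: set[int] = set()  # objects on the current reference path
    finished: set[int] = set()     # objects already proven loop-free

    def visit(obj_id: int) -> None:
        if obj_id in in_progress:
            raise ValueError(f"object {obj_id} embeds itself, directly or indirectly")
        if obj_id in finished:
            return
        in_progress.add(obj_id)
        for child in embeds.get(obj_id, []):
            visit(child)
        in_progress.discard(obj_id)
        finished.add(obj_id)

    visit(root)
```

  Note that the recursive walk carries an assumption of its own: a hostile file could nest objects deeply enough to hit the interpreter's recursion limit, so a sturdier version would use an explicit stack or a depth cap.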
  
  --
  Cut that out, or I will ship you to Norilsk in a box.
6. Re:xkydgtufhlofhil by elronxenu · 2010-04-02 01:52 · Score: 3, Insightful
  
  Linux: "It's not a bug, not any more."
7. Re:xkydgtufhlofhil by mobby_6kl · 2010-04-02 03:53 · Score: 2, Funny
  
  >In this case you could say that he's not only giving an example, but is testing the slashdot user comments code as well.
  It's testing not just the user comments code, but also the moderation system code and the moderators themselves. In this case, it appears that he found a bug which causes the comment to be moderated Insightful by providing a certain combination of random characters as input. I will now attempt to replicate this problem.
  ______TEST DATA FOLLOWS______
  TvaHokVAwgZGLrzPnDsIzHnKwuOOQEgaFskFJx-9JH@eIbwWSYhujyXDekeBP-9YQlfiZtdOZXlupfvy
  UYXenTsWzzF#SScvbvWXtMMcbMg@xIsRC!OiViEDnt-9fQRGXEgvbfdlBATolRyiVYmcKyHi-9bLVcYx
  JrPmw
8. Re:xkydgtufhlofhil by Helen+O'Boyle · 2010-04-02 03:55 · Score: 3, Informative
  
  "nobody's going to have a single-quote character in their name" (hello, SQL injection attack)
  Hey, I resemble that remark! And yes, it's resulted in chuckles over the years. Microsoft, DevelopMentor, random e-commerce sites... many have fallen to the Irish. When talking to security professionals, I introduce myself as "the woman whose name is a SQL injection attack", and it seems to help them remember me.
9. Re:xkydgtufhlofhil by nmb3000 · 2010-04-02 04:16 · Score: 2, Insightful
  
  Well color me red, here I thought this kind of testing should have been done prior to release. Guess the new model of software development is to have the users discover the bugs (can I get a smiley on this) instead of paying a QA team to test.
  No, color you stupid. Office 2010 hasn't been released yet.
  Nice try though.
  
  --
  "What do you despise? By this are you truly known." --Princess Irulan, Manual of Muad'Dib
  /)
10. Re:xkydgtufhlofhil by jonadab · 2010-04-02 13:10 · Score: 2, Interesting
  
  It isn't just names, either. Apostrophes and other special characters show up all kinds of places in data where naive programmers tend to imagine they won't appear. Did you know a less-than symbol can show up in a book title? Oh, yes, and if you aren't doing entity-encoding when you build HTML from the data you will get a surprise. With experience, you eventually learn to write the code so that it will either accept those characters as part of the data and handle them as such, or in cases where that's not desirable (like, say, non-numeric characters in a year field) catch them preemptively and issue a clear error message to the user. SQL injection is a particularly easy thing to fix, because you can just use prepared statements with bound variables, but nearly every program of any size or complexity is going to run into situations where it has to do more complex data checking. User-entered data is going to have stuff in it that you didn't anticipate. Every programmer has to learn this lesson, and most have to learn it repeatedly until they eventually become near-paranoid and borderline obsessive about it.
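
  A small sketch of both fixes using only the Python standard library: `sqlite3` with `?` placeholders for the bound variables, and `html.escape` for the entity-encoding. The table, the name and the book title are made up for the example.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")

name = "O'Boyle"  # the apostrophe would break naive string concatenation
# The ? placeholder sends the value separately from the SQL text.
conn.execute("INSERT INTO people (name) VALUES (?)", (name,))
rows = conn.execute("SELECT name FROM people WHERE name = ?", (name,)).fetchall()
print(rows)

title = "Tips & tricks for <b> tags"  # hypothetical book title
print(f"<li>{html.escape(title)}</li>")
```

  In both cases the data is never spliced into the surrounding language (SQL or HTML) as raw text, which is why no special-casing of individual characters is needed.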
  
  I was gratified when users came to me complaining that they got an error message about time travel not being permitted. Ha! I *knew* it was a good idea to write a test for the end of an appointment being before the beginning of the same appointment. I don't even want to *think* about the bugs that would have ensued if that data had got into the database. The routine that checks whether a room is free at a certain time wouldn't have handled it correctly, that's for certain.
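
  The check itself can be tiny. Something along these lines, where the function name and the error text are illustrative rather than the original code:

```python
from datetime import datetime


def validate_appointment(start: datetime, end: datetime) -> None:
    """Reject appointments whose end falls before their beginning."""
    if end < start:
        raise ValueError("time travel is not permitted: "
                         "the appointment ends before it begins")
```

  The important part is that it runs before anything is written to the database, so the room-availability code never has to reason about a negative-length booking.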
  
  You're not just paranoid. The data really are out to get you. You have to be ready for them, ready for *anything* they may throw at your code. If you're not careful, they'll get you.
  
  --
  Cut that out, or I will ship you to Norilsk in a box.
Hey, Microsoft! by geminidomino · 2010-04-01 21:15 · Score: 5, Funny

"We also want to fix things that are not security concerns."
It's 5AM EST. April Fools' day is over everywhere but a few pacific islands. Give it up already.
1. Re:Hey, Microsoft! by somersault · 2010-04-01 21:20 · Score: 2, Funny
  
  While a large number, it's important to note that that doesn't mean we found 1,800 security issues
  Don't worry, we all know that you haven't fixed any security issues.
  
  --
  which is totally what she said
2. Re:Hey, Microsoft! by PolygamousRanchKid+ · 2010-04-01 21:21 · Score: 4, Insightful
  
  Note that he said "want" and not "will".
  
  --
  Schroedinger's Brexit: The UK is both in and out of the EU at the same time!
New bugs by El_Muerte_TDS · 2010-04-01 21:17 · Score: 5, Insightful

I wonder how many "new" bugs they'll create by fixing the found bugs.
Anyway, it's nice to see that they're performing fuzzing tests; not enough people or companies do that. There's also relatively little tool support for it.
1. Re:New bugs by GF678 · 2010-04-01 22:18 · Score: 2, Interesting
  
  I wonder how many "new" bugs they'll create by fixing the found bugs
  Yeah, just like the numerous regressions I see in the Linux kernel, WINE, Ubuntu releases etc.
2. Re:New bugs by beakerMeep · 2010-04-01 22:38 · Score: 4, Funny
  
  Fuzzing tools probably won't ever gain widespread acceptance outside of the furry community though.
  
  --
  meep
If only this was easier... by Manip · 2010-04-01 21:32 · Score: 2, Interesting

This is a great testing methodology, but to be honest I'm not sure it is within the reach of most software firms. While I'm sure we could all drop entirely random data into a parser and see if it fails, to REALLY conduct a test you have to do the same thing broken down by data element in the file format, and then for each element try both realistic and unrealistic test cases.
Then you throw UI and web-page fuzzing on top of that, and you now have to somehow hook every element on a site and throw in random data, which is not realistic with a large rich application.
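
The per-element approach described above might look roughly like this in Python. The three-field record layout, the edge values and the function names are all assumptions for the sake of the sketch; a real format would need one entry per field in its specification.

```python
import struct
from typing import Iterator

# Hypothetical record layout: (field name, struct format code, realistic value)
FIELDS = [("version", "H", 3), ("width", "I", 640), ("height", "I", 480)]

# Boundary values worth trying for each unsigned field width.
EDGE_VALUES = {
    "H": [0, 1, 0x7FFF, 0x8000, 0xFFFF],
    "I": [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF],
}


def pack_record(values: dict[str, int]) -> bytes:
    """Serialize the fields little-endian, in declaration order."""
    layout = "<" + "".join(code for _, code, _ in FIELDS)
    return struct.pack(layout, *(values[name] for name, _, _ in FIELDS))


def field_cases() -> Iterator[tuple[str, int, bytes]]:
    """Yield records in which exactly one field holds an edge value."""
    baseline = {name: value for name, _, value in FIELDS}
    for name, code, _ in FIELDS:
        for edge in EDGE_VALUES[code]:
            yield name, edge, pack_record({**baseline, name: edge})


if __name__ == "__main__":
    for name, edge, blob in field_cases():
        print(f"{name}={edge:#x}: {blob.hex()}")
```

Keeping every other field at a realistic value means the record gets past the parser's early sanity checks, so the edge value reaches the code that actually uses that field. The cost is exactly the one raised above: someone has to write and maintain the layout table.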
1. Re:If only this was easier... by somersault · 2010-04-01 21:55 · Score: 4, Informative
  
  The whole point of the data is that it's unrealistic. There are a few tools out there for doing this type of testing, or easily modified to do it. I haven't used many testing tools but you could take something like Skipfish and add in some fuzz testing pretty easily.
  
  --
  which is totally what she said
2. Re:If only this was easier... by owlstead · 2010-04-01 22:26 · Score: 5, Insightful
  
  As with all testing tools, the more of them you use, the better. There are many reasons why you don't want to employ all tests, e.g. lack of knowledge, lack of manpower, lack of money or lack of time. The good thing is that if you can get them automated, then they quickly become affordable.
  For example: I was wondering whether it was wise to put findbugs (which works on compiled byte code) next to checkstyle (which works on source code level) in my Java project. Obviously I put them both in; they report some of the same bugs, but who cares? I'll just look at checkstyle first and findbugs second. If I can put in a pre-build fuzzing component I probably will.
  But fuzzing tools are different than unit tests. Fuzzing can never cover every nook and cranny. They will produce reports that are much less readable, and that cannot be directly tied to particular events (e.g. during regression testing). If anything, they'll put some pressure on developers to put in more unit tests; if the fuzzing tool finds many bugs in a component, it should be a good indicator that even the basic unit tests have not been created.
3. Re:If only this was easier... by SharpFang · 2010-04-01 23:01 · Score: 2, Informative
  
  A fuzzer isn't really hard to write.
  Pick a word-based variant of Dissociated Press that requires similarity a random number of words back/ahead and allows split on special characters (separators) besides whitespaces. Feed it a lot of your actual files. Actually, the amount of data it can produce may be vastly bigger than the amount of data it takes in, because it can jump back and forth in the input files recombining their fragments multiple times.
  Of course then you need a test unit that feeds the fuzz to your program.
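
  One loose reading of that recipe, sketched in Python. This is an interpretation of the Dissociated Press idea, not a reference implementation: the tokenizer, the `max_overlap` parameter and the restart rule are my own assumptions.

```python
import random
import re


def tokenize(text: str) -> list[str]:
    """Split into words and single separator characters, dropping whitespace."""
    return re.findall(r"\w+|[^\w\s]", text)


def dissociated_press(tokens: list[str], length: int, rng: random.Random,
                      max_overlap: int = 3) -> list[str]:
    """Walk the corpus, jumping between places that share the last few tokens."""
    if not tokens:
        return []
    pos = rng.randrange(len(tokens))
    out: list[str] = []
    while len(out) < length:
        out.append(tokens[pos])
        overlap = rng.randint(1, max_overlap)  # how many tokens must match
        tail = out[-overlap:]
        # Every corpus position that ends with the same run of tokens.
        candidates = [i for i in range(len(tail) - 1, len(tokens))
                      if tokens[i - len(tail) + 1:i + 1] == tail]
        if candidates:
            pos = rng.choice(candidates) + 1  # continue after a matching spot
        else:
            pos = rng.randrange(len(tokens))  # no match: restart anywhere
        if pos >= len(tokens):                # ran off the end of the corpus
            pos = rng.randrange(len(tokens))
    return out
```

  Feed `tokenize()` the concatenation of your real sample files, then join the output tokens back together (with or without spaces, depending on the format) and hand the result to the program under test. The scan for candidates is linear in the corpus size for every token emitted, which is fine for a sketch but would want an index for large inputs.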
  
  --
  45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
4. Re:If only this was easier... by digitig · 2010-04-02 01:46 · Score: 2, Interesting
  
  The solid theoretical ground would be fine if they were starting from scratch now (and looking at some of the research coming out of MS, they do seem to be trying for it -- we'll have to wait and see whether it delivers). One big problem for them, though, is maintaining compatibility with earlier versions of Office which were not written using what is now current best practice. Once you start trying to implement code with behaviour that's not properly understood, or pulling in code that's not properly understood, then that best practice is some help, but it doesn't give you the robustness you might want. The alternative would be to abandon back-compatibility, but that would throw away all their (perceived) lock-in and make it too easy for customers to jump ship, so that would probably be a bad business decision.
  
  --
  Quidnam Latine loqui modo coepi?
5. Re:If only this was easier... by BitZtream · 2010-04-02 02:29 · Score: 2, Interesting
  
  It's only a great model for testing if you've exhausted the extensive list of known bugs that people hit every day under common circumstances.
  Finding bugs in the file format is great and all, but fixing the bugs that users actually see every day is far more important, and you can rest assured it will be released with a bucket load of very obvious bugs that should have been fixed rather than dicking around throwing random data at it.
  I know there are potential security issues to deal with and those are important, but they still aren't as important as the users experience with the software and actually getting their own job done. Saying 'don't open word docs from someone else until this is fixed!' is a lot more practical than hearing that person say 'I'm not using Word, this retarded table layout bug is pissing me off, can we find something else to use instead of Word?'
  I'm not saying they shouldn't, I'm just saying their priorities are wrong on a scale that is hardly imaginable.
  As far as being realistic with a large rich application ...
  Citation Needed.
  Given enough processors sitting around, the size of the application or its feature set becomes of little concern. It may take longer, but that's no excuse to not do it.
  
  --
  Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
6. Re:If only this was easier... by plover · 2010-04-02 03:07 · Score: 2, Interesting
  
  I wouldn't knock what they're doing. As we've recently seen with Adobe, exploits in the payload format can be used to manipulate users and even launch code. And remember how we used to be all panicky about Word macro exploitations until the defaults were changed to shut them off? "Good times", indeed.
  Consider that Microsoft dominates the market, and that the ".DOC" format is widely accepted across companies. Nowadays .DOC files are readily passed by email filters, web filters, etc. Office workers open them in previewers and Word without giving a second thought to security.
  A buffer overrun or other fault in the handling of .DOC files could offer a hacker a way to deliver a malicious payload through channels that are today trusted worldwide. For all we know, these could already be exploited by phishing attacks.
  It's definitely worth Microsoft's time and effort to execute these tests.
  
  --
  John
Re:"Botnet?" by nacturation · 2010-04-01 21:44 · Score: 4, Funny

FTFA:

Microsoft was able to find such a large number of bugs in Office 2010 by using not only machines in the company's labs, but also under-utilized or idle PCs throughout the company. The concept isn't new: The Search for Extraterrestrial Intelligence (SETI@home) project may have been the first to popularize the practice, and remains the largest, but it's also been used to crunch numbers in medical research and to find the world's largest prime number.
"We call it a botnet for fuzzing," said Gallagher, referring to what Microsoft has formally dubbed Distributed Fuzzing Framework (DFF). The fuzzing network originated with work by David Conger, a software design engineer on the Access team.
Odd that they would call it that publicly, given the negative connotation of the word. I would have called it "fuzzy clouds grid computing" or something like that.
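
For a feel of how fuzzing work can be farmed out, here is a toy Python sketch. It is not Microsoft's DFF, whose design the article doesn't describe; threads stand in for idle PCs, and `target`, `run_batch` and `distribute` are hypothetical names.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def target(data: bytes) -> None:
    """Hypothetical parser under test; fails on one specific leading byte."""
    if data and data[0] == 0xFF:
        raise ValueError("parser crashed")


def run_batch(seeds: range) -> list[int]:
    """Run one fuzz case per seed and return the seeds that crashed."""
    crashed = []
    for seed in seeds:
        rng = random.Random(seed)  # the seed alone reproduces the case
        case = bytes(rng.randrange(256) for _ in range(8))
        try:
            target(case)
        except Exception:
            crashed.append(seed)
    return crashed


def distribute(total: int, workers: int) -> list[int]:
    """Split seeds 0..total-1 into contiguous batches, one per worker."""
    if total <= 0:
        return []
    step = -(-total // workers)  # ceiling division
    batches = [range(lo, min(lo + step, total)) for lo in range(0, total, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(run_batch, batches))
    return sorted(seed for batch in results for seed in batch)
```

Because every case is derived from its seed, a worker only has to report back a list of integers, and anyone can regenerate the crashing input later. In a real deployment the batches would go to other machines over the network rather than to threads in one process.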

--
Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
Re:"Botnet?" by Mathinker · 2010-04-01 21:48 · Score: 4, Funny

Let me explain: Microsoft discovered that all of their desktop computers were zombied with malware, and after wresting control from the botnet C&C, decided to take advantage of this increased ability to remotely administer their computers to run QA tests, on the off chance there might be some need for it.
</joke>
Re:"Botnet?" by benjamindees · 2010-04-01 22:11 · Score: 5, Funny

They had to infect the computers with Office 2010.

--
"I assumed blithely that there were no elves out there in the darkness"
Re:"Botnet?" by El_Muerte_TDS · 2010-04-01 22:28 · Score: 4, Funny

"Cluster Fuzzed" would be much better, specially when somebody finds a remote exploit in their cluster code, then Microsoft will be cluster fucked.
Re:"Botnet?" by laederkeps · 2010-04-01 22:30 · Score: 2, Funny

So the project is a "Cluster fuzz" ?
Re:Speaks to the complexity by zippthorne · 2010-04-01 22:50 · Score: 4, Insightful

Your point being? In 10 years since I started using it, I still don't know all the Vi commands and Emacs is so daunting I never even attempted it.

--
Can you be Even More Awesome?!
Or user a sales. by leuk_he · 2010-04-01 23:19 · Score: 2, Interesting

It is an alternative to the monkey test: Take a sales person from across the hallway and let him click on your application. If it does not crash or give absurd error messages, you can do the actual testing.
GIGO!
Re:"Botnet?" by shutdown+-p+now · 2010-04-01 23:22 · Score: 2, Informative

Odd that they would call it that publicly, given the negative connotation of the word. I would have called it "fuzzy clouds grid computing" or something like that.
Developers tend to name things that are used internally in a way that is short and more to the point, which is not necessarily something perfect for marketing/PR.
Sometimes these things slip through.
Re:1800 down, 10,000,000 to go by swilver · 2010-04-01 23:35 · Score: 2, Funny

The same as I thought. Tip, meet iceberg.
One would think that this is the case... by WD · 2010-04-01 23:41 · Score: 2, Interesting

What you describe is "smart" or "generational" fuzzing, where you have a detailed knowledge of the target that you are fuzzing. The thing is, dumb (mutational) fuzzing is still effective. Very effective. Check out Charlie Miller's CanSecWest presentation - An analysis of fuzzing 4 products with 5 lines of Python
http://securityevaluators.com/files/slides/cmiller_CSW_2010.ppt
In 3 weeks of (really) dumb fuzzing, 174 unique crashes in PowerPoint were discovered.
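
To show how little machinery dumb fuzzing needs, here is a self-contained Python sketch. It is not the code from the presentation: `toy_parser` is an invented target with a deliberate bug, and bucketing crashes by exception type and line number is just one simple way to count "unique" crashes.

```python
import random
import traceback


def toy_parser(data: bytes) -> int:
    """Hypothetical target: a length-prefixed record reader with a trust bug."""
    declared = data[0]                # IndexError on empty input
    payload = data[1:1 + declared]
    return payload[declared - 1]      # IndexError if the declared length is wrong


def mutate(seed: bytes, rng: random.Random, ratio: float = 0.1) -> bytes:
    """Overwrite about `ratio` of the bytes (at least one) with random values."""
    buf = bytearray(seed)
    if not buf:
        return bytes(buf)
    for _ in range(max(1, int(len(buf) * ratio))):
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seed: bytes, runs: int, rng: random.Random) -> dict[str, bytes]:
    """Return one example input per distinct crash site."""
    crashes: dict[str, bytes] = {}
    for _ in range(runs):
        case = mutate(seed, rng)
        try:
            toy_parser(case)
        except Exception as exc:
            frame = traceback.extract_tb(exc.__traceback__)[-1]
            key = f"{type(exc).__name__} at line {frame.lineno}"
            crashes.setdefault(key, case)  # keep the first input per bucket
    return crashes


if __name__ == "__main__":
    for site, case in fuzz(b"\x04ABCD", 500, random.Random(1)).items():
        print(site, case)
```

Against a real application the `try`/`except` would be replaced by launching the program on a mutated file and watching for a crash, but the loop has the same shape.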
that doesn't mean we found 1,800 security issues by Geminii · 2010-04-02 02:01 · Score: 3, Insightful

it's important to note that that doesn't mean we found 1,800 security issues.
"...we have absolutely no idea where THOSE are."
Re:Speaks to the complexity by 140Mandak262Jamuna · 2010-04-02 02:50 · Score: 2, Interesting

So why don't you do something instead of constantly griping? Find some open source project that comes close to what you want and contribute to it. Even if you are not a developer, work on documentation, testing, bug reporting or something.

--
sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
No surprise, with that "format"! by Hurricane78 · 2010-04-02 03:14 · Score: 3, Insightful

Have you even seen the “specification” that MS tried to make a standard? It’s a horribly convoluted mess that can only be described as an upside-down pyramid of always patching new stuff onto the old framework, while never doing a needed complete re-design. Like Windows ME.
Hey Microsoft! If there are more bugs than features in your file format, maybe you should do a re-design, hm? ;)

--
Any sufficiently advanced intelligence is indistinguishable from stupidity.
Dumbest possible way to not find errors by Ancient_Hacker · 2010-04-02 03:26 · Score: 2, Insightful

Remember the very obvious maxim of Dijkstra: testing can only tell you there ARE errors, it can't tell you there AREN'T errors.
Randomly poking at data only finds you the very dumbest errors. It takes some real thinking and mulling to realize, hey, if an XML field crosses this buffer boundary, and the last 4-byte Unicode code was cached, it's going to get bashed by the next 3-byte escape code. Or 255 bytes of code-page Yen symbol (255) followed by a 254 will lead to sign-extension and access to an address in the kernel trampoline DLL. Those kinds of combinatorial errors are not going to be discovered by random poking at the data.
So they're going to give (and have given) everybody a false sense of security, when the basic method can do nothing of the sort. It can only find errors of the most trivial sort. It can't find errors that thousands of unemployed Russian hackers can dream up tests for, and it can only FIND errors, not tell you there aren't an unlimited number of remaining errors.
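
For anyone unfamiliar with the sign-extension hazard mentioned above, this Python sketch mimics what happens when C code stores a byte in a signed char and then widens it. It illustrates only the general mechanism, not the specific code-page scenario; `widen_signed` is a made-up helper.

```python
import struct


def widen_signed(byte_value: int) -> int:
    """Mimic a signed char being widened to an unsigned 32-bit value."""
    (signed,) = struct.unpack("b", bytes([byte_value]))  # 255 becomes -1
    return signed & 0xFFFFFFFF                           # -1 becomes 0xFFFFFFFF


for value in (0x7F, 0x80, 0xFE, 0xFF):
    print(f"{value:#04x} -> {widen_signed(value):#010x}")
```

Any byte of 0x80 or above turns into a huge offset once widened, which is how a lookup meant to stay inside a 256-entry table can end up pointing far outside it.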
It's not a botnet. by NotBornYesterday · 2010-04-02 03:52 · Score: 2, Insightful

It's distributed computing.

Wait, I suppose it could be a botnet, if MS's IT department distributed the required software by exploiting security holes in the victim OS instead of just using admin rights to install the new app. Come to think of it, that might be easier ... [me scurries off to develop new easy-to-use set of malware-based admin tools].

--
I prefer rogues to imbeciles because they sometimes take a rest.
Re:wow imagine that by natehoy · 2010-04-02 04:04 · Score: 2, Insightful

Yes, I've taken that class.
I'm not talking about testing, I'm talking about design. If you expect a URL in a field and someone puts executable code in there, you should not be executing the code - you should be rejecting the URL. Data of that nature should not be put in a memory area where an instruction can be sent to run it.
Stack overflows, buffer underruns, and things of that nature are not things that should be caught in testing. They are things that should be prevented in the first place. If your code can't write data from strangers in places it can execute it, you can't be caught with your pants around your ankles when someone sends you executable code in a text field.
I'm not saying this testing is a bad thing, it's great, and necessary, and wonderful, and all that! But I sincerely hope Microsoft learned the lesson and Office 2012 or whatever the next version is will at least get some protected mode lovin' so they can separate data space from execution space and stop crossing the streams.
Maybe then Patch Tuesday will stop being so darned exciting.

--
"This post contains words, known to the State of California to cause thought. Wash brain thoroughly after reading."
Re:wow imagine that by Blakey+Rat · 2010-04-02 05:42 · Score: 2, Interesting

I don't know of anyone who does regular fuzzy testing. Everyone that matters does unit testing.
Just FYI, Microsoft does fuzz testing in all areas of business, not just Office. The "news" here is really that the Office fuzz testing is done with a cluster of the developers' own computers. (Although it's definitely a good story to get out to all the shitty software houses out there that don't already do fuzz testing.)
When I worked in Xbox game testing back when the Xbox 360 was shiny and new, we had a large pile of Xbox 360s that did nothing but fuzz-testing of new titles by feeding them random controller input.

--
Comment of the year