IE Shines On Broken Code
mschaef writes "While reading Larry Osterman'a blog (He's a long time Microsoftie, having worked on products dating back to DOS 4.0), I ran across this BugTraq entry on web browser security. Basically, the story is that Michael Zalewski started feeding randomly malformed HTML into Microsoft Internet Explorer, Mozilla, Opera, Lynx, and Links and watching what happened. Bottom line: 'All browsers but Microsoft Internet Explorer kept crashing on a regular basis due to NULL pointer references, memory corruption, buffer overflows, sometimes memory exhaustion; taking several minutes on average to encounter a tag they couldn't parse.' If you want to try this at home, he's also provided the tools he used in the BugTraq entry."
Since it may not be obvious to all readers, be aware that when you can make a program crash by feeding it bad data, you can typically further manipulate the data you are sending it to take control of the program. That means a security hole. This is how buffer-overruns work. You can't always do it, but you can think of each way you can crash a program as a little "crack" in its exterior. If you can figure out a way to pry apart the crack, you've got yourself a hole.
.NET languages, rather than programming in straight C like the others) which as a side effect prevented such problems.
So many of these "bugs" in Mozilla, Opera, Lynx, and Links are likely security holes as well.
It is interesting, then, to see that Internet Explorer did so well on this, with its notoriously bad history on security. My first instinct would be that the HTML parsing engine in Internet Explorer was written by a different team of programmers than worked on the rest of the software, and they used proper programming techniques (such as RAII in C++, or perhaps used one of their
Let's hope that all these bugs are taken care of in the other browsers quickly before the black hats find ways to make use of them.
It's hard for thee to kick against the pricks.
There's a good phrase I can use to explain this one:
If you work in a monkey house, you expect to be pelted with shit.
I'd love to read the article, but the page seems to contain malformed HTML...
They didn't say that IE also started randomly installing Bonzi Buddy et al during the test, the users' credit card numbers were automagically emailed to Romania, there was an sudden increase in outbound port 25 traffic from the system, and they ended the session with about 37 momre toolbars installed then they started with.
Aparently, XPSP2 (including the new IE) was recompiled with the latest visual studio and with all the options turned on to better catch issues.
Potentially.
does the fact that they do crash a good thing?
No. Never ever is it a good idea to crash on receipt of invalid data. It's up to the program to try and parse this, realise it can't do so successfully, then act ccordingly (error message, best-guess try, whatever. I prefer error message myself, but can understand those who prefer best-guess).
Cheers,
Ian
I don't know if they still use it, but the Linux kernel developers used to use a program called "crashme" to help test kernel stability. Essentially, it generated random code and tried to execute it. Something like this for web browsers would make for a very useful procedure. Generate the code, throw it at the browser and log the code if it crashed the browser.
Nothing crashed. I got blank pages, all the weird HTML and all, but no errors and nothing crashed. w00t.
Great, but then it also encourages people to write bad code - see all that code with broken tables and a million tags that remain unclosed?
You're confusing two seperate things here:
This guy has been testing for (2) and not (1). Bad HTML should never cause crashes, memory corruption and buffer overflows. Period.
Finally, you can't go blaming the users for bad input. One of the golden rules of software design is that all software should either reject or handle gracefully bad input. Crashing is not graceful.
Avantslash - View Slashdot cleanly on your mobile phone.
None of the samples in http://lcamtuf.coredump.cx/mangleme/gallery/ was able to crash Konqueror from KDE CVS Head. Heheh time to praise Khtml developers again!
Never learn by your mistakes, if you do you may never dare to try again
I saw something like this (not quite, but similar) a few years ago working with Java Script.
I wasn't that experienced with it, and as a result, certain pieces of my code were syntactically incorrect. Specifically, I was using the wrong characters for array indexing; I think I was using "()" instead of "[]". I would never have known there was even a problem if I hadn't been doing side by side testing with IE and Mozilla. A page that rendered correctly in IE would always show errors in Mozilla. This made absolutely no sense to me.
It wasn't until I viewed the source generated by each browser that I discovered the problem. IE was dynamically rewriting my JavaScript, replacing the incorrect delimiters with the correct ones, whereas Mozilla was simply taking my buggy code at face value.
XHTML is supposed to be refused if malformed; HTML prior to 4.0 is supposed to be best-guessed. I'm not sure what the behaviour of 4.0 Transitional and 4.0 Strict is supposed to be, but I'm sure it's documented as part of the spec.
This isn' insightful at all. First, you'll be the first person to bitch when a mozilla virus comes out.
Second, "crashing when invalid" as you and many others are alluding to is NOT a good idea. What if you had another tab open with email/urls/info you needed?
What if other software took this route? Invalid operands to open()? Time to crash. Invalid socket used in send()? Time to crash. Segfault in application? Kill the kernel processes!
It's a problem, it has to be fixed and there aren't two ways about it.
Tom
Someday, I'll have a real sig.
I'd really prefer it to just refuse to parse the page mentioning that the code is bad instead of crash. As much as I like Firefox/Moz, when a piece of software is fed bad data, it should say so, not die on the spot, ever.
People replying to my sig annoy me. That's why I change it all the time.
I may be a little paranoid (heck, I actually am) but I've long suspected the IE support for loose HTML was a strategic decision. Go back to the days when Netscape would render a page with a unclosed table tag as blank. IE rendered the page, and I often encountered sites that didn't work on Netscape.
It could be a coincidence, but the loose HTML support of IE led to a situation where some webmasters conclude that Netscape had poor HTML support. You can argue about standards all day long, but if one browser renders and another crashes or comes up blank there isn't much of a contest.
-- Solaris Central - http://w
Here's the mozilla_die1.html codeAnd the mozilla_die2.html codeIt looks like he came across places where either boundary checks or type checks are not in place.
Besides, he's had access to almost all the browswer code, hasn't he?
I mean, these bugs are bad, but I'm sure if I had access to IE's code I could come up with a zillion bugs.
No. I don't care how bad the input is, if my program reads the input and throws an access violation, then it is my job to fix my program, test the input more, assume less about it or whatever, until my program does something more sensible and less dangerous with the input - like giving up with an error message or even an assertion failure.
I repeat: code that crashes with a null pointer error is wrong. End of story.
My Karma: ran over your Dogma
StrawberryFrog
http://lcamtuf.coredump.cx/mangleme/gallery/
I opened all the pages in tabs in Firefox 0.10.1 under Windows 2000, and Firefox did not crash. It became somewhat unresponsive, but I could still select other tabs, minimise and maximise. I could not load new pages anymore.
Can someone else test this as well, please?
And can someone tell us whether this has security implications or not?
Ugh. Not the best written Slashdot entry.
:-)
Larry Osterman -- former Microsoft guy; someone forwarded him a post to Bugtraq.
Michael Zalewski -- absurdly brilliant security engineer out of Poland. Did the pioneering work on visualizing randomness of network stacks, passively identifying operating systems on networks, and way way more.
Nothing bad against Larry. But this is all Zalewski
--Dan
Thats the thing about randomness. You can never be sure.
/. is a bunch of nerds at a million typewriters. It's not a political conspiracy determined to undermine your beliefs.
> all the error correction code helps to keep IE bloated and slow.
Bloated compared to what?!
Slow compared to what?
IE has quite a small footprint for a web browser. I've opened this page in IE and Firefox. Currently IE is using 19Mb of ram and Firefox is using 28Mb. In fact, currently the top three processes using the most RAM on my machine are all open source products (the top two being Firefox and the enormously memory hungry Thunderbird which is currently using 58mb of RAM). All the commercial software comes later.
IE also tends to render pages faster than Firefox under most circumstances (except where Linux advocate article authors have carefully crafted CSS heavy pages which cause IE to slow down a bit).
Nevermind using random garbage to crash a browser, you can make IE6 crash with perfectly valid strict HTML.
Try this page in IE6 and then hover your pointer over the link. Crash!!!
-
So what? I have never had a problem with my Firefox crashing (ever). Sure, if you try to make something crash, it eventually will. Considering how much security holes IE has, IE could be the missing link, and I still wouldnt use it.
Just because you haven't crashed it doesn't mean it's not happening. I switched my Mom over to Firefox for her computer's safety about 2 months back. She's still using it, but it crashes for her regularly and it's becoming a big frustration for her. As she put it "why does Firefox crash so much, IE never crashed on me?" If Mozilla/Firefox/Opera/etc. hope to continue gaining ground on IE, then this type of thing needs to be addressed.As I see it the major problem that Mozilla/Firefox has is the vast majority of those using it (and most definitely the vast majority bothering to report bugs/crashes) are techies. Why is that a problem? Well we probably don't spend our time to going to "silly" E-card sites and joke sites that use bad flash/html. Sure we can dismiss those sites as not important, because to us they aren't, but to a large portion of the average users out there they're one of the most important things they do in a browser because to them they're fun.
So I'm betting Mozilla/Firefox actually crashes regularly on non-techies simply because they visit sites that most techies don't bother to test the browser on.
Case in point.
Last week I wrote some Perl to process an mbox mail folder. I just wanted a quick and dirty way to view its contents in a web page. A couple of CPAN modules and a few dozen lines of code and thing was done. Then I started to get fancy and dealing with stuff like embedded MIME-encoded GIF images. This was pretty simple to do, but I made a mistake. Once I had the decoded GIF data lying around, I wrote it to the HTML file of the current e-mail message, rather than writing it to a seperate file and writting <img src="foo.gif"> in the HTML file.
I was viewing the results with Firefox 0.10.1. When it got to a message with an embedded GIF, with a big slodge of GIF binary data sitting in the middle of the page, Firefox either just sat there spinning its hourglass, or crashed and burned.
Then I looked at the same file with IE, and the GIF image showed up. I was puzzled for a while until I noticed that in the directory where I had created the file, no GIF files had been created. It is of course arguable that IE should not have attempted to render the GIF image from the binary data sitting in the middle of the page, but it did so without complaint. Not rendering it would also be acceptable.
Firefox, on the other hand, has a number of better alternatives to crashing or hanging. Should it display gibberish (like when you forget to set up your bz2 association correctly) or nothing, or the image? I don't know, and don't particularly care about which course of action is taken. Anything is better than crashing, especially when IE doesn't.
Anyway, I fixed the Perl code, and all is well.
The End
One guy with ten minutes came up with ways to crash Mozilla, Lynx, and Links, yet hundreds of thousands of programmers with years of access to the same code haven't fixed these same bugs.
XHTML is supposed to be refused if malformed; HTML prior to 4.0 is supposed to be best-guessed.
This reenforces my belief that XHTML is the way forward since it reduces the code complexity of the browser:
XHTML: Try to parse - fail - give up
HTML: Try to parse - fail - Try to reconstruct - hit bug - crash
XHTML is also good because it removes the fuzzy area of what to do if the code is crap - with HTML, a web developer will write a page, won't bother to validate it and just check it works in IE. Since different browsers have different methods of fixing broken code, the results of this page are not platform independent. With XHTML, if the developer writes broken code it just plain won't work. The management who pay the web developer probably don't know anything about standards compliance and if it works in IE the developer gets paid, but if it just sits there with a parse error the developer will either have to fix it or not get paid (Good Thing).
That said, IMHO there is something to be said for a couple of additions to the XHTML spec:
1. a button on the "parse error" page which tells the browser to render it as tag soup - that way the end user can try to view the page anyway even if it's broken (whilest still being informed that it really is broken code).
2. an automatic feedback system in which the browser will post details of the parse error back to the server. Otherwise the developer may never know there's a problem (especially important with dynamically generated markup which may not be easilly validated).
Similarly, it would be really nice, IMHO, if browsers made it clear (by placing a big X on the status bar or something) when they are viewing broken *HTML* code since this would indicate to the user why the page might not look quite right and would be an indication to the management not to pay the web designer they hired since he is obviously lacking in the ability to do his job.
http://blog.nexusuk.org
As he stated in the article; the crashes are sometimes platform-specific.
I've tried this in 1.0PR firefox on win32, and the crashes do occur there.
I've gotta say - this really looks like a great tool; a simple and effective way of finding some bugs!
--Eamon
With such a powerful parsing engine you would thing IE could parse web standards a little better.
Has it ever occurred to you that it is in MS interest to parse bad HTML? Maybe even to encourage bad HTML so IE is considered the best browser by the man in the street. Now where's my tin foil hat?
I hadn't the slightest objection to his spending his time planning massacres for the bourgeoisie... (P.G. Wodehouse)
... and here's why.
With correct data (in this case, HTML), there is a specified action that is "correct". In other words, a correctly marked up table will get layed out, according to the W3C rules for laying out tables. A paragraph will get formatted as a a paragraph, etc.
With malformed markup, the "correct" thing to do is indeterminate. If every browser just takes its best guess, they will all diverge, and the behavior is wildly unpredictable. Even from version to version of the same browser, the "best guess" will change.
"So? You've just described the web!" Well, exactly, but it could have been avoided. Bad markup shouldn't render. It ain't rocket science to do (or generate, though that can be a harder problem) correct markup. If you had do it to get your pages viewed, you would. Ultimately, it wouldn't cost anymore, and would actually cost less (measure twice, cut once).
Of course, what I just wrote only really applies in a heterogenous environment ... which MS doesn't want ... fault tolerance in your own little fiefdom can make sense.
HTML is out there, and millions of malformed pages exist. Most of this is a result of mistakes by authors, but some of it is a result of the moving target that HTML has presented in the past. /. story is also about how various browsers, including the saintly Firefox, can be made to *crash* given certain input. Just thought that should get a mention :)
While your argument is attractive in principal, in practice it's misguided. The horse has bolted. in 2004, no-one would use a browser that didnt work with a huge proportion of the web's content. This is an area where pragmatism is required.
And to respond to the ubiquitous MS-bash, let's step back and remind ourselves that this
(And BTW, I speak as a Firefox user)
A tool like this would let the average wanna be contributer find a reproducable bugs and try to fix them. Which brings me to my dumb question: Is the Mozilla gecko engine more easily built/tested than the whole of Firefox? I love FF, and wouldn't mind throwing some cycles at improving it, but the entire build process is a bit more than I really want to take on... If I could just build and unit-test the failing component I'd be more likely to try.
Anyone have pointers beyond the hacking section at MozillaZine?
Just to be clear, unparseable XHTML is not XHTML. In "Matrix" terms, there is no web page. Instead, there is a string of text that may resemble XHTML to the casual observer but that doesn't really represent anything at all.
Arguing that browsers should half-support broken XHTML is like saying that a C compiler should do something whenever it encounters invalid C, since the user obviously wants to run the code and isn't interested in bowing to the pedantic demands of some irrelevant standards committee.
One is rather more important than the other in this context.
I agree completely, but I don't think it's the one that you picked.
Dewey, what part of this looks like authorities should be involved?
On any given day we know of many HTML inputs that will crash Mozilla, and many that will crash IE, and ditto for other browsers. Which ones get fixed is simply a matter of priorities. And we prioritize by looking at the crash to see if it looks like it could be turned into a security hole; looking at talkback data to see which crashes people are hitting most frequently; focusing on the ones that occur on actual real websites, and maybe after that when there's nothing else to do we fix the ones exposed by artificial testcases.
No-one has enough resources to fix every bug, not even Microsoft.
Does this mean we've all been wrong about Microsoft products?
Actually yes. People here always talk about Microsoft products being buggier than the average, without any evidence to back it up beyond their own prejudices.
They use to laugh at the "much inferior" IE code, until the mozilla project got started and it turned out Netscape had the inferior code base.
OSSers used to laugh at the "bloat" of the windows source code.... until Linux got to have a decent user interface that is, and guess what? source code size is comparable to Windows.
There are many reasons to loathe the evil empire (monopolistic bully for one), but buggy code is not one of them. That is just something OSSers tell each other to feel better about what they do.
Netscape used to crash very often. Looks like the Mozilla people didn't learn much from it.
b id=182569">Description of Internet Explorer security zones registry entries</a>
n dows\Curr entVersion\Internet Settings\Zones\0\Flags
w s\Curre ntVersion\Internet Settings\Zones\0\Flags
Mozilla is just as sucky security-wise as the old non-mozilla Netscape (3.x 4.x). Whether it is OSS or not doesn't make it secure/insecure, it's the programmers that count. Look at Sendmail and Bind (and many other ISC software), security problems year after year for many years. Look at PHPNuke - security problems month after month for years. Look at OpenSSL and OpenSSH and Apache 2.x - not very good track records. Compare with Postfix and qmail, djbdns.
Most programmers should stick to writing their programs in languages where the equivalent of "spelling and grammar" errors don't cause execution of arbitrary attacker-code. Sure after a while some writers learn how to spell and their grammar improves but it sometimes takes years. For security you need _perfection_ in critical areas, and you need to be able to identify and isolate the critical areas _perfectly_ in your architecture.
To the ignorant people who don't get it. Crashing is bad. A crash occurs when the (browser) process write/read data from areas where it shouldn't be touching, or tries to execute code where it shouldn't be executing. This often occurs when the process somehow mistakenly executes _data_ supplied by the attacker/bug finder, or returns to addresses supplied by the attacker...
This sort of thing is what allows people to take over your browser, and screw up your data (and possibly take over your computer if you run the browser using an account with too many privileges).
So while the FireFox people get their code up to scratch maybe people should reconsider IE - IE isn't so dangerous when configured correctly. Unfortunately it's not that simple to do that.
To make even unpatched IE browsers invulnerable to 95% of the IE problems just turn off Active Scripting and ActiveX for all zones except very trusted zones which will never have malicious data. Since I don't trust Microsoft's trusted zone (XP has *.microsoft.com as trusted even though it doesn't show up in the menus), I create a custom zone and make that MY trusted zone.
By all zones I mean you must turn those stuff off for the My Computer zone as well - but that screws up Windows Explorer in the default view mode (which is unsafe anyway).
For more info read this: <a href="http://support.microsoft.com/default.aspx?k
To make the My Computer zone visible change:
(for computer wide policy)
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Wi
To: 0x00000001
(for just a particular user)
HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windo
To: 0x00000001
If you don't want to edit the registry and make the My Computer zone visible, you can still control the My Computer Zone settings from the group policy editor (gpedit.msc) or the active directory policy editor.
You just have to know some Microsoft stuff. But hey, securing an OSS O/S and _keeping_ it secure (esp when u need to run lots of 3rd party software) also requires some in-depth knowledge.
Jon Postel said it best: "Be liberal in what you accept, and conservative in what you send." A web browser that crashes due to invalid HTML fails this test. Execution must stop at once... yes, but the program should handle the error. A program that crashes is the Operating System handling a program with bug. See RFC 1122 for more details about the Requirements for Internet Hosts. Section 1.2.2 about the Robustness Principle explains better than I can why you're wrong.
I have to ask:
When saying that Microsoft Internet Explorer didn't crash, does he mean that the window never went away, or that the program iexplore.exe stayed running? I can't prove it, but I suspect that the "IE" window would survive a crash of the rendering engine, because the window is actually provided by explorer.exe, which is the desktop manager.
I also suspect that several of the open source browsers could defend themselves against this kind of crash within a day or two, simply be using a two process model. Personally, I would rather they did not! (I want to see it fail, otherwise I would not know something was wrong.)