Zlib Security Flaw Could Cause Widespread Trouble
BlueSharpieOfDoom writes "Whitedust has an interesting article posted about the new zlib buffer overflow. It affects countless software applications, even on Microsoft Windows. Some of the most affected applications are those that are able to use the PNG graphic format, as zlib is widely used in compression of PNG images. Zlib was also in the news in 2002 because of a flaw found in the way it handled memory allocation. The new hole could allow remote attackers to crash the vulnerable program or possibly even execute arbitrary code."
Why are we still having buffer overflows? There's a compile option in Visual C++ that allows automatic buffer overflow protection. Does GCC have this switch? If not, why not? And why are people not using this? We have enough processing power on a typical PC to spend on security measures such as this. Performance is not an excuse.
Looking further, this is an interesting example of the problems with monoculture. The BSD TCP/IP stack was copied for Windows and Mac OSX - this is great, it saves a tonne of time, but it also means you inherit the exact same bugs as the BSD stack. This gives you an impression of how difficult it is to design a secure operating system. If you borrow code such as this, you have to make sure it's secure. You can't really do that without line by line analysis, which is unrealistic. In libraries the problem is especially acute. If you make a mistake in a well used library it could affect hundreds of pieces of software, as we've seen here.
We can't modularise security either, like we can modularise functionality, because you can take two secure components and put them together and get insecurity. Despite the grand claims people make about formal verification, even this isn't enough. The problem with formal verification is that the abstraction of the language you're using to obtain your proof may not adequately represent the way the compiler actually compiles the program. Besides, it's possible to engineer a compiler that deliberately miscompiles itself such that it compiles programs with security flaws in them.
What I'm trying to say is that despite what the zealots say, achieving security in software is impossible. The best we can do is mitigate the risk as far as we can. The lesson to learn from security flaws such as this is that while code-reuse is good for maintainability and productivity, for security it's not great. As always, security is a trade-off and the trade-off here is whether we want to develop easy to maintain software quickly or whether we want to avoid the risk of these flaws being exploited. Personally, I fall in the code-reuse camp.
Simon.
I wonder if it'd be possible to create a binary patch for prebuilt binaries?
Anyone got some suggestions?
Yes, both Visual C++ and the GCC ProPolice extensions provide stack and heap protection. And in general these techniques have a minimal impact on execution speed. Unfortunately, this does not solve the problem. There are still viable attacks that can be performed by avoiding the stack canaries or heap allocation headers and overwriting other vulnerable data. The probability of developing a successful exploit is lower, but it's still there.
I don't disagree that building secure applications is hard, but it's certainly not impossible. Modularized code just adds another layer of complication and potentially confusion. Most of this can be addressed by documenting the design and interface constraints, and ensuring that they're followed. At that point even most security vulnerabilities are primarily implementation defects. Defects will of course still occur, but the trick is to build systems that fail gracefully.
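To make "fail gracefully" a bit more concrete, here is a rough sketch in Python (my choice of language, not anything from the article) of inflating untrusted data through the standard zlib module. The 1 MiB cap and the names safe_inflate and MAX_OUTPUT are made up for illustration. The idea is only that corrupt, truncated, or oversized input gets rejected instead of taking the program down.

```python
import zlib

MAX_OUTPUT = 1 << 20  # hypothetical cap: refuse anything that inflates past 1 MiB


def safe_inflate(data, limit=MAX_OUTPUT):
    """Return the decompressed bytes, or None if the input cannot be trusted."""
    inflater = zlib.decompressobj()
    try:
        # Ask for at most limit + 1 bytes so an oversized stream is detectable
        # without ever holding much more than the cap in memory.
        out = inflater.decompress(data, limit + 1)
    except zlib.error:
        return None  # corrupt header or body: reject instead of crashing
    if len(out) > limit:
        return None  # output exceeds the cap (possible decompression bomb)
    if not inflater.eof:
        return None  # stream ended early or its checksum trailer is missing
    return out
```

A caller would treat None as "drop this input and log it", which is where the logging and monitoring layer comes in.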
Developers must account for defects and expect that every form of security protection will fail given enough time and effort. This is why the concept of "Defense in Depth" is so important. By layering protective measures you provide a level of security such that multiple layers have to fail before a compromise becomes truly serious. Combine that with logging and monitoring, and a successful attack will usually be identified before damage is done.
Take the above vulnerability and assume it exists in an exploitable form in a web app running on Apache with a Postgres backend. If the server had been configured from a "Defense in Depth" perspective it would be running in a chroot jail as a low privilege account. Any database access required would be performed through a set of stored procedures or a middleware component that validates the user session and restricts data access. SELinux or GRSecurity would be used for fine grained user control on all running processes. All executables would also be compiled with stack and buffer protection.
In the above scenario, you see that this single exploit wouldn't get you much. However, most systems are deployed with one layer of security, and that's the problem.
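For the "chroot jail as a low privilege account" layer, the order of operations matters, so here is a minimal sketch in Python. The function name drop_privileges and the ops parameter are hypothetical; ops defaults to the real os module and exists only so the sequence can be exercised without root. This is not how Apache does it, just the general Unix pattern as I understand it.

```python
import os


def drop_privileges(jail_dir, uid, gid, ops=os):
    """Confine the process to jail_dir, then give up root for good.

    `ops` is a hypothetical hook: it defaults to the os module, and a
    stand-in object can be passed to check the call order without root.
    """
    ops.chroot(jail_dir)  # needs root, so it must happen before the uid change
    ops.chdir("/")        # make sure the working directory is inside the jail
    ops.setgroups([])     # shed supplementary groups inherited from root
    ops.setgid(gid)       # group first: after setuid we could no longer change it
    ops.setuid(uid)       # finally become the low privilege account
```

The real calls only work on Unix and only when started as root. An attacker who gets code running after this point is stuck with an unprivileged account inside the jail, which is the whole point of the layer.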
Has anyone read the zlib code? While the author clearly tried to make it readable, it's still very complex and it's very hard to see at a glance, or even after many glances, where potential buffer overflow problems may exist (or even where it might fail to implement the deflate algorithm). C is a great language for writing an operating system where all you care about is setting bits in a machine register, but this algorithm really taxes its abilities.
For comparison here is the deflate algorithm written in Common Lisp. It all fits neatly into a few pages. This is a far better language comparison example than the oft-cited hello world comparison.
If the argument were that simple, static linking would never occur.
The flip side of the argument is that installing a broken zlib will break all applications that are dynamically linked, but have no effect on those that are statically linked.
Remember too that an upgrade to a dynamically linked function means that proper testing must include all software that uses that function. A statically linked application can be tested as a standalone unit.
The resulting isolation of points of failure and lower MTTR is often seen as an advantage in production environments.
I remember this specific situation occurring in a production environment I worked in. A common library was updated, causing the failure of multiple critical applications. The ones not impacted? Statically linked.
Both sides of the discussion clearly have advantages and disadvantages; they have to be weighed to determine the proper risk/benefit.
Can You Say Linux? I Knew That You Could.
Not in the least bit. Observe: I just verified with ldd that Xorg and firefox have libz dynamically linked in on my system, which means that on program restart they will pick up the code from the shared library at runtime. It's the whole point of a dynamically linked library.
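You can see the same build-time versus run-time split from inside a program. As a small sketch, Python's zlib module exposes both the version it was compiled against (ZLIB_VERSION) and the version actually loaded (ZLIB_RUNTIME_VERSION). The helpers parse_version and is_at_least are made-up names, and the minimum version to compare against is whatever your vendor's advisory says, not something I'm asserting here.

```python
import zlib


def parse_version(text):
    """Turn '1.2.11' into (1, 2, 11), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_at_least(version, minimum):
    """True if `version` is the same as or newer than `minimum`."""
    return parse_version(version) >= parse_version(minimum)


# The first string is fixed when the module is built; the second reflects
# the library that is actually in use by the running process.
print("built against:    ", zlib.ZLIB_VERSION)
print("loaded at runtime:", zlib.ZLIB_RUNTIME_VERSION)
```

With a dynamically linked libz the second line changes after the library is upgraded and the program restarted, while the first stays put. With a statically linked build both stay at the old version until someone rebuilds.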
Now once upon a time, a lot of distributions (and open source projects out of the box even) would just static link in libz for some reason or another, but after some security issues in the past that caused massive headaches for package maintainers, that practice has largely ceased.
XML is like violence. If it doesn't solve the problem, use more.
The flip side of the argument is that installing a broken zlib will break all applications that are dynamically linked, but have no effect on those that are statically linked.
That's why some people like to use (say) Debian stable in production environments: security fixes are backported to the well-tested version of the lib, making a breakage quite unlikely.
We're pretty badly off-topic here, but what the hey...
C was first designed and implemented in the time period from 1969-1973. It is hardly a critique of its original designers and implementors that we have learned a lot about programming language design and implementation in the succeeding 30+ years, and that many of the constraints of the computing environment have been weakened or removed during that time. Indeed, some of the original designers of C and UNIX spent a lot of time 10+ years ago developing an alternative language and runtime for writing operating system and application code that fixes the problems with C that I described.
"In fact, when you are coding things like process and memory mangement routines and libraries, it is very handy to be able to do arithmetic with and compare to variables that are not "exactly" the same type, if the comparison or operation otherwise makes sense. Hence, things like the boolean FALSE and integer 0 being equal (which Java will complain about) are handy."
If by "handy", parent meant "tempting" but "error-prone" and "potentially insecure", I think there's about 30 years' experience to back up this claim. Things as fundamental or important as my operating system's process or memory management routines are occasionally broken in particularly dangerous ways because their programmer did something that seemed to "make sense" at the time, even though a "safe" programming language wouldn't allow it. Go look at the changelogs of a recent UNIX kernel for plenty of examples.
"The lack of dynamic type checking, operand checking and bounds checking allows the programmer to write low level or system code that gets out of the way of higher level code." I'm sorry, but I don't know what this means.
"Imagine the performance degradation at the kernel if every comparison was dynamically checked for type, operand and bounds." One would prefer that operations be checked statically whenever possible. This is not so much for performance as because failed runtime checks in low-level code are difficult to handle gracefully. That said, as I mentioned in my previous post on this topic, we have known for a long time how to build programming languages so that a combination of static and runtime type and operand checking will provide some correctness guarantees without signficantly impacting execution performance. IMHO, it's way past time to start using that knowledge.