The Story Behind a Windows Security Patch Recall
bheer writes "Raymond Chen's blog has always been popular with Win32 developers and those interested in the odd bits of history that contribute to Windows' quirks. In a recent post, he talks about how an error he committed led to the recall of a Windows security patch."
Heck, why not just go all the way and cut them loose?
"THERE IS NO JUSTICE, THERE IS ONLY ME." -Death
Why are the trolls out in force here? Oh, Microsoft... Nevermind...
Raymond Chen would be iFired, or at least told to iRTFM.
Seriously, it's good to get a glimpse of the interactions in the dev side of MS. It's astonishing that MS even allows this to happen at all. The March 07 Wired had a feature on Channel 9 that humanized the MS organization quite a bit, IMO. It's not just about chair-throwing, marketing hyperbole, and world domination after all... oh wait.
Science never settles, never rests.
This is fascinating. The system for exiting a process is so complicated that a lot of implementations fail. In fact, it's so complicated that even Microsoft can't get it right. Sounds like an unbounded loop to me.
Okay, he made an error. Why the HELL wasn't it caught in QA? Microsoft wants us to believe that the reason that we have to wait for patches is that they are getting some kind of exhaustive QA. This patch and executable were specifically created to avoid problems with invalid shell extensions. Don't you think that given that fact the thing to do would be to test it with some invalid shell extensions?
This is the reason that Windows admins have to be so much more paranoid about patches than the rest of us. A Windows patch is highly likely to be a big pile of crap that causes your system to not work properly. I think we can all remember certain service packs that broke various versions of Windows NT pretty much completely...
If you can't have confidence that security patches will fix more than they break, how can you have sufficient confidence to even install that vendor's products, let alone count on them for mission-critical applications?
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
I think the lesson here is not that this guy should have been more careful about programming, it's that no amount of careful programming can overcome a stupid design. It's stupid that there are magical filenames in the form of UUIDs that cause Explorer to load and run arbitrary DLLs. You can't get around this stupidity with some kind of speculative watchdog thread that works with what sound to me like some seriously questionable heuristics.
They should have simply got rid of the magic naming system in favor of something explicit, such as a Shell Extension Interface that a shell extension must fully implement.
This illustrates the kind of employee I like to have. One who can talk about his mistakes the same way he talks about anything else work-related.
Some years ago I myself made a rather expensive mistake which involved the design of an aircraft structure. The fellow I was working for at the time had one of those razor-blade intellects and I got called into his office for a chat. When he asked me what happened I had two choices, weasel or turkey. In engineering it's always possible to talk the complicated talk and hope to obfuscate your way out of a situation, but fortunately I said "I made a mistake." And you know what? That was exactly the answer he was looking for.
You see, the most important thing is not to be perfect, it's to be honest. That's what a boss, of which I am one now, wants.
If you have a boss that doesn't want that, better watch out for yourself.
Equine Mammals Are Considerably Smaller
On the day after Patch Tuesday, January 2006, I got a somewhat frantic call from a client. She's a lawyer, had a filing deadline, but could not save a document in MS Word. That's not all that this patch broke: you couldn't open My Computer or My Documents on the desktop (though you could navigate to them by typing the path in the Start -> Run box), and IE wouldn't let you type just "www.[website].com" in IE's address bar. You had to prepend the "http://".
I verified that "Save" and "Save As..." were not working in Word. Word would just hang and only Task Mangler could shut it down. I carry the Sysinternals utilities on CD and USB key, so I rebooted and ran FILEMON, REGMON, and PROCEXP to see what was happening when I tried to save a doc in Word. Sure enough, Word would spawn verclsid.exe as a child process and then hang.
I googled "verclsid" and "Explorer", got nothing on the web and about a dozen Usenet posts from people having the same problem. I played a hunch and renamed verclsid.exe to verclsid.exX. I do that when I'm manually hunting malware that leaves .exe and .dll files that are named just like Windows system files. Keeps my foot bullet-free.
Problem solved. When the patch for the patch came out, a working verclsid.exe was dropped in %system% and I deleted the .exX.
Oh, and the buggy third party shell extension came with a very common HP DeskJet printer. As for Google, the next day I googled "verclsid": there were hundreds of web results and Usenet hits. The day after, tens of thousands. This one bit a lot of people in the ass.
k.
"In spite of everything, I still believe that people are really good at heart." - Anne Frank
Reminds me of a famous story about Jack Welch, former GE CEO. One of the company's division managers made a mistake costing the company $10 million in one quarter. When the quarterly reports came out, he got a call from headquarters telling him to be in Welch's office in NY the next morning. Welch grilled the man for some time, asking him what he was thinking and how he could possibly lose so much money. When it seemed Welch had finished, the manager said he understood that Welch had to fire him now. To which Welch replied, "Why would I fire you when I just invested $10 million in your education?"
This pretty much rendered Windows useless (explorer, file open / save dialogs and the IE7 addressbar were not working) if you had software installed for HP cameras, HP scanners, or any HP DeskJet printer that included a card reader.
Courtesy of JSI FAQ:
You experience one or more of the following strange behaviors:
- You are unable to open special folders, like My Documents or My Pictures.
- Some 3rd party applications hang when accessing My Documents.
- Office files won't open in Microsoft Office if they are stored in My Documents.
- Entering an address into Internet Explorer's address bar does nothing.
- The Send To context menu has no effect.
- The plus (+) sign on a folder in Windows Explorer does nothing.
- Opening a file via an application's File / Open menu causes the application to hang.
This behavior is caused by a new VERCLSID.EXE binary, which validates shell extensions before Explorer.exe, the Windows Shell, can use them. VERCLSID.EXE is installed by the MS06-015 (908531) security update.
The following 3rd party applications cause VERCLSID.EXE to hang:
Hewlett-Packard's Share-to-Web Namespace Daemon ("%ProgramFiles%\hewlett-packard\hp share-to-web\Hpgs2wnd.exe"), auto-started from the Registry Run key and the Startup menu, which ships with:
HP PhotoSmart software
Any HP DeskJet printer that includes a card reader
HP Scanners
Some HP CD-DVD RWs
HP Cameras
Sunbelt Kerio Personal Firewall which has a feature that prompts when Explorer launches VERCLSID.EXE, but you can configure it not to prompt.
To work around this behavior, add the HP shell extension to the VERCLSID.EXE white list:
1. Open a CMD.EXE window.
2. Type the following command and press Enter:
REG ADD "HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Shell Extensions\Cached" /V "{A4DF5659-0801-4A60-9607-1C48695EFDA9} {000214E6-0000-0000-C000-000000000046} 0x401" /T REG_DWORD /F /D 1
3. Shut down and restart your computer.
NOTE: If you find other COM controls or shell extensions that cause this behavior, you can add them to the white list.
How much open source work have you actually done? I've done a lot, and this idea is one I see very often in people who haven't done any serious API development work before. The approach of attempting to patch every app when an API changes simply doesn't scale. There's a reason all the important open source APIs (gtk, glibc, alsa, X etc) have "gone stable" in the past 5 years, and it's simply a better approach.
Anyway, ignoring the obvious (!!) problems of scaling such an approach, you are confusing two unrelated things. Microsoft can simplify/clean up APIs too - they have done it with DirectX and .NET, but that's irrelevant. The problem here is that there are lots of people in the world writing software who perhaps aren't well qualified, and even the ones that are well qualified make mistakes, even with the implementation of quite simple interfaces like IUnknown. I myself have messed up IUnknown before, in fact.
The root problem that caused the hang was attempting to cleanly handle buggy software. This is a common motif in software, hell, it practically motivated the move from the Windows 9x design to the NT fully protected architecture.
Multi-threading is never trivial.
I worked on Wine for a long time, which implements or maps the Win32 API. The complexity of Linux, Windows and MacOS X is much the same - they are of the same design era, even OS X which is based on lots of older code at its heart. While the more modern parts of the Linux APIs like GTK+ are better than the Win32 equivalents that's just an age thing: the Win32 API has evolved over a much longer period of time. That means it's uglier (the world has learned a lot about API design since the 80s), but it also means there are far more people out there who know it, better tools support, and critically, more apps that use it!
But they only have to maintain source-level compatibility. Microsoft has to maintain binary-level compatibility.
Also, when specific things are extremely and seriously broken, compatibility can be dropped altogether, and some buggy programs broken. Microsoft cannot afford to break buggy programs, even if those are few and far between - nobody can fix them.
Using a multi-threaded approach here, when SMP scalability is not an issue, suggests that either their API design is crap, and requires threading, or that their engineers are incompetent and use threads unnecessarily. Threads are never trivial - but what they were trying to do was quite trivial. It's their fault they involved threads in there.
Compare the complexity of APIs. fork/exec vs CreateProcess. open vs CreateFile. To use either of the Windows ones you have to call multiple APIs with multiple complicated structures documented upon pages and pages of explanations you must do more than skim through.
Windows may have similar complexity in some subsets of its APIs, but the Windows APIs I had the misfortune to use were insanely complicated unnecessarily.
Using a multi-threaded approach here, when SMP scalability is not an issue, suggests that either their API design is crap, and requires threading, or that their engineers are incompetent and use threads unnecessarily. Threads are never trivial - but what they were trying to do was quite trivial. It's their fault they involved threads in there.
This is one of the stupidest comments I've read here in a long time. A secondary "watchdog" thread was employed to enforce a time-out on the helper program's sniffing of a given shell extension, so that if the main thread hung while trial-hosting a faulty shell extension, there would still be another thread of logic outside of the infinite loop that could run and tell Windows Explorer the result.
If you knew anything about what you're trying to talk about, you'd know that multi-threading is used for these kinds of situations, as well as in GUI programming. And not just "when SMP scalability is an issue". This has nothing to do with the Win32 API design, it just was tackling a very specific problem. It doesn't mean that the Win32 API "requires threading", or that MS's engineers are incompetent, and that they used an additional thread unnecessarily here. Threads can be trivial, and this is I would say actually the most trivial case of their use. It's to their credit that they involved threads here (and might actually have been the only way), and it's to your ignorance that you don't understand any of this and got everything wrong about it.
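For readers who haven't seen the pattern, here is a rough Python sketch of a watchdog-style time-out. It is not the verclsid.exe code (that is Win32 and isn't shown anywhere in this thread), and the roles are flipped for brevity: the risky call runs on a worker thread and the caller does the bounded waiting. `probe_with_watchdog`, `good_extension`, and `hanging_extension` are made-up names.

```python
import threading
import time

def probe_with_watchdog(probe, timeout_s):
    """Run probe() on a worker thread and give up if it takes too long."""
    result = {}
    done = threading.Event()

    def worker():
        result["value"] = probe()
        done.set()  # only reached if probe() actually returns

    # daemon=True so a hung probe cannot keep the interpreter alive at exit
    threading.Thread(target=worker, daemon=True).start()

    # The caller acts as the watchdog: it waits, but only for a bounded time.
    if done.wait(timeout_s):
        return ("ok", result["value"])
    return ("timeout", None)

def good_extension():
    # Hypothetical well-behaved extension
    return "loaded"

def hanging_extension():
    # Hypothetical buggy extension that does not return before the deadline
    time.sleep(60)
    return "too late"
```

Calling `probe_with_watchdog(hanging_extension, 0.2)` reports a timeout instead of blocking the caller. Note that the hung worker is only abandoned, not killed, and a probe that raises is also reported as a timeout in this sketch.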
The flaw was in doing the WaitForSingleObject() in the DLL's detach process function without specifying a timeout value. Even if you have no reason to think that the thread won't be there to signal you eventually, sometimes the unthinkable occurs.
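To make the point concrete without Win32, here is a small Python analogue, offered as a sketch only. In Win32 terms the idea is to pass a finite timeout to WaitForSingleObject() rather than INFINITE; below, `threading.Event.wait(timeout=...)` stands in for that call, and `cleanup_on_detach` is a hypothetical name.

```python
import threading

def cleanup_on_detach(worker_done, timeout_s):
    """Wait for the worker's 'I am finished' signal, but never forever.

    Returns True if the worker signaled, False if we gave up waiting.
    """
    signaled = worker_done.wait(timeout=timeout_s)
    if not signaled:
        # The worker never signaled (maybe it is already gone).
        # Skip the graceful handshake and let shutdown proceed.
        pass
    return signaled

# A worker that was terminated before it could signal: the event stays unset.
never_signaled = threading.Event()

# A worker that finished normally.
finished = threading.Event()
finished.set()
```

With an unbounded wait, the first case would block shutdown forever; with the bounded wait it simply returns False after `timeout_s` seconds.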
Attention zealots and haters: 00100 00100
The far-simpler approach is to use asynchronous programming. Never use blocking API calls. All good APIs always provide non-blocking interfaces.
If long computations are required, split them up into short computations and run the short computations asynchronously from the reactor (event loop).
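A quick sketch of what that looks like, using Python's asyncio as the reactor (my choice for illustration, not something the shell uses). `sum_in_chunks`, `heartbeat`, and `main` are hypothetical names; the long sum is cut into short slices and the loop gets control back between slices.

```python
import asyncio

async def sum_in_chunks(values, chunk_size=1000):
    """Add up values in short slices, yielding to the event loop in between."""
    total = 0
    for start in range(0, len(values), chunk_size):
        total += sum(values[start:start + chunk_size])
        # Hand control back to the event loop so other tasks can run
        await asyncio.sleep(0)
    return total

async def heartbeat(ticks):
    """Stand-in for other work (UI, I/O) that must not be starved."""
    for _ in range(3):
        ticks.append("tick")
        await asyncio.sleep(0)

async def main():
    ticks = []
    total, _ = await asyncio.gather(
        sum_in_chunks(list(range(10_000))),
        heartbeat(ticks),
    )
    return total, ticks
```

Running `asyncio.run(main())` finishes the sum while the heartbeat task still gets its turns. This only helps if every piece cooperates; a chunk that never returns still stalls the whole loop.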
Firstly, a process is not a thread. It is much safer and cannot cause another process's exit to hang or overwrite its memory/etc. Secondly, I would explicitly declare the interface the plugin must adhere to, and verify that it does, rather than just trying IUnknown and watching it for "hanging" via a timeout heuristic.
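As a sketch of the process-isolation half of that argument (the interface-declaration half is not shown), here is a Python version that runs an untrusted check in a child process with a deadline. `check_in_child_process` is a hypothetical helper, and the code strings passed to it stand in for "load and probe the plugin".

```python
import subprocess
import sys

def check_in_child_process(code, timeout_s):
    """Run a snippet in a separate interpreter; classify the outcome."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            timeout=timeout_s,      # the child is killed if it exceeds this
            capture_output=True,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if proc.returncode == 0 else "failed"
```

A hang or crash in the child cannot corrupt the parent's memory, and the parent can always kill it. For example, `check_in_child_process("import time; time.sleep(30)", 0.5)` returns "timeout" rather than hanging the caller.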