Windows PE - Preinstallation Environment. About 200MB in size, it's a fully ROM-bootable Windows system, including limited network (4 connections at once max) and full NTFS support. I've actually got a list somewhere of what Windows APIs it does and does not support.
It used to be you could extract it from early RCs of XP, but that was removed in the final build, and now it's only available to OEMs and corporate partners.
Hah. He took the established ARC format, which had copyrighted, free-as-in-beer routines in C, and rewrote them in x86 asm for speed... and then sold PKARC (Phil Katz ARC) as a commercial product. The original inventors of ARC sued him - he even kept the same misspellings in the strings, for fuck's sake. He settled for a lump sum, then ended up making a couple of changes to the ARC format and renamed it PKZip.
That, and if you actually look at the ZIP format, you'll notice that it's all routines invented by other people. "Shrink" is dynamic LZW, "Reduce" is RLE with a second-pass probabilistic encoder, and "Implode" is a sliding dictionary with post-compression using Huffman/Shannon-Fano tree encoding.
Katz was an excellent promoter and had good networking skills. I admire him for that much, and for establishing a de facto format that scaled nicely to 64-bit sizes and arbitrary-length Unicode filenames. HOWEVER, he was hardly a pioneer in compression algorithm design. Give him credit where credit is due.
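For anyone who hasn't seen LZW up close, here's a minimal sketch of the plain textbook encoder in C. It is NOT PKZIP's actual "Shrink" code (that one adds variable code widths and partial dictionary clearing); the names `lzw_encode` and `lzw_find`, the 4096-entry limit and the linear dictionary search are all my own assumptions, picked for readability rather than speed.

```c
#include <assert.h>
#include <stddef.h>

#define LZW_MAX_CODES 4096  /* assumed 12-bit code space */

/* One dictionary entry: the string is "string of code `prefix`" + byte. */
struct lzw_entry { int prefix; unsigned char byte; };

/* Linear search for (prefix + byte); returns its code, or -1 if absent. */
static int lzw_find(const struct lzw_entry *dict, int count,
                    int prefix, unsigned char byte)
{
    for (int i = 256; i < count; i++)
        if (dict[i].prefix == prefix && dict[i].byte == byte)
            return i;
    return -1;
}

/* Encode len bytes into integer codes; returns the number of codes written. */
size_t lzw_encode(const unsigned char *in, size_t len, int *out, size_t out_cap)
{
    static struct lzw_entry dict[LZW_MAX_CODES];
    int count = 256;            /* codes 0..255 stand for the single bytes */
    size_t n = 0;

    if (len == 0)
        return 0;

    int cur = in[0];            /* code of the longest match so far */
    for (size_t i = 1; i < len; i++) {
        int next = lzw_find(dict, count, cur, in[i]);
        if (next >= 0) {        /* match can be extended, keep going */
            cur = next;
            continue;
        }
        assert(n < out_cap);
        out[n++] = cur;         /* emit the longest match */
        if (count < LZW_MAX_CODES) {
            dict[count].prefix = cur;   /* learn match + next byte */
            dict[count].byte = in[i];
            count++;
        }
        cur = in[i];            /* restart matching at the current byte */
    }
    assert(n < out_cap);
    out[n++] = cur;             /* flush the final match */
    return n;
}
```

Feeding it the classic 24-byte test string "TOBEORNOTTOBEORTOBEORNOT" gives 16 codes, the tenth of which is 256 (the learned "TO").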
To this day I have a working Z80 system I built myself, including using old TTL chips for all the address decoding, which I recently replaced with a simple FPGA implementation. I'm still comfortable with Z80 and x86 up to the 486s or so, I've written my own bootloaders for x86 boxen (fuck GRUB), I can decode PIC opcodes visually from hex. I've literally counted cycle times down to the microsecond for bitbanging out RS232 and PWM signals on 4MHz PICs. And you're tellin me I don't know what tight code is? :P
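To put a number on the cycle counting: on a classic PIC one instruction cycle takes four oscillator clocks, so a 4MHz part runs one cycle per microsecond. A rough sketch of the per-bit budget in C follows; `cycles_per_bit` is a made-up helper name, and the Fosc/4 rule is assumed to hold for whatever part you're using.

```c
#include <assert.h>

/* Instruction cycles available per serial bit, rounded to nearest.
   Assumes one instruction cycle = 4 oscillator clocks (classic PIC). */
unsigned long cycles_per_bit(unsigned long fosc_hz, unsigned long baud)
{
    assert(baud > 0);
    unsigned long instr_hz = fosc_hz / 4;   /* 4 MHz -> 1,000,000 cycles/s */
    return (instr_hz + baud / 2) / baud;    /* round to nearest cycle */
}
```

At 4MHz and 9600 baud that works out to about 104 cycles per bit, which is why every instruction in the delay loop has to be counted.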
*shrug* You and I may like speed... but for the average user microkernels offer more advantages.
You and I are running servers and gaming systems; we want pure performance, and don't mind rebooting a couple of times or recompiling a kernel to change hardware or upgrade drivers. In contrast, my mom has trouble right-clicking My Computer and choosing Properties to get a driver list. For her, a layered driver system that can dynamically load and unload drivers as needed and layer itself against instability is a MAJOR plus, even if it's not for me.
Just because I favor speed over robustness doesn't mean either is intrinsically better - it just means my needs run that way. And, being a programmer, chances are that my needs represent a very tiny fraction of the computer users out there.
Assuming that you get a decent bus-mastering card and that the only other cards on the bus are the NICs (i.e. the video card is AGP) you can probably do 4-way with 33MHz... barely. If you want to do any more than that you should start looking into 64-bit transfers.
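The "barely" comes straight from the bus arithmetic. Here's a back-of-the-envelope sketch in C; `pci_peak_mb_per_s` is a hypothetical helper, and it only gives the theoretical peak (real throughput is lower once arbitration and burst overhead are counted).

```c
#include <assert.h>

/* Theoretical peak of a parallel bus in MB/s: bytes per transfer times
   millions of transfers per second. Ignores arbitration and overhead. */
unsigned long pci_peak_mb_per_s(unsigned bus_bits, unsigned long clock_hz)
{
    assert(bus_bits % 8 == 0);
    return (bus_bits / 8) * (clock_hz / 1000000UL);
}
```

A 32-bit bus at 33MHz tops out around 132 MB/s and a 64-bit one around 264 MB/s. If the NICs are 100Mbit full duplex (my assumption, the post doesn't say), four of them can ask for roughly 4 x 25 = 100 MB/s, which leaves very little headroom on the 32-bit bus.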
There's a reason that SGI's unfortunately short-lived NT boxen had a single PCI-64 slot on the Cobalt mobos, and that every configuration sold put a drive controller in there. Those were unbelievably sexy boxes for their day. At a time when the GF2 was a few months old and the 1GHz P3 had just been announced, I was able to load up six windowed 640x480 copies of Q3 and have them botmatch each other in q3dm12. ^_^ Dual 733MHz P3s, 512MB of RAM (of which 192MB was dedicated to framebuffering and texturing... selectable in a graphical BIOS)... too bad they went down the tube along with the rest of SGI.
Re:Correction to Answer on Is Mac OS X Slow? · Score: 3, Interesting
Those responsible for the previous corrections have been sacked. :)
Let us not forget that Cocoa can be used from C++ and Carbon from Obj-C - and that you can always just use plain C or C++ and Carbon if your application absolutely cannot waste time on dynamic type checking. I've gotten fond of Cocoa lately, but I'm working on an audio application that needs almost ridiculously low latency, so I have to have fairly fast callbacks - so I'm doing it in Carbon. The extra pain in the GUI is worth the performance for this case, although it may not be so for all things.
Four-channel ATA-100 RAID-5 cards can be had for under $200 today. Even if you only used one drive per channel and four 70GB drives that's still 210GB of space that can recover from a single drive failure, with solid read speeds and acceptable write speeds. (To recover from two or more drives failing at once means moving to P+Q redundancy, aka RAID-6, and you start moving into price ranges beyond the reach of the average hobbyist.)
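The capacity math is simple enough to sketch in C. `raid_usable_gb` is a made-up helper; it assumes equal-sized drives and counts one drive's worth of parity for RAID-5 and two for RAID-6 (P+Q).

```c
#include <assert.h>

/* Usable capacity of a parity array built from equal-sized drives.
   parity_drives is 1 for RAID-5 and 2 for RAID-6 (P+Q). */
unsigned raid_usable_gb(unsigned drives, unsigned drive_gb, unsigned parity_drives)
{
    assert(drives > parity_drives);
    return (drives - parity_drives) * drive_gb;
}
```

Four 70GB drives give 210GB under RAID-5; the same four drives under RAID-6 would drop to 140GB in exchange for surviving a second failure.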
Actually, chances are he was referring to the group of four women who mix classical scores to dance beats and recently released a new CD. Their first one, "Born," was outstanding.
I don't know what they use internally... but the Developer Tools on the third Jaguar CD use the jam project manager (make alternative) to shell out to GCC for compiling. All the compiler options are available. In fact, there's even an option to choose between gcc-2.95 and the 3.1 beta.
It also wouldn't make much sense to use different IDEs for different APIs. Cocoa can be used with C++, and Carbon with Objective-C, you know. They're just different APIs, one that takes advantage of OO and dynamic typing wherever possible, the other that takes a more fragmented but backwards-compatible approach. Personally, I like both of them; I'm still more comfortable with C++ though than with Obj-C, so I use Carbon more.
(I used to hate Obj-C, mostly because it does dynamic type checking out the ying-yang and I'm an old Z80/8088 optimization hack, but I'm warming up to it now. It's somewhat like mixing C and Smalltalk, easy to learn but at the cost of orthogonality.)
From another guy with a foot in both camps - about the only thing that's proprietary is the mobo and CPU, both of which you can buy individually these days.
A friend of mine recently built a system himself entirely homebrew - bought a gigabit-enabled mobo and an 800MHz G4 online, and reflashed a GF2MX with a Mac BIOS. Add an IDE HDD and DVD-R drive and a plain-jane ATX PSU, hit the power button, install Jaguar. Tada. I'm running a slightly overclocked Beige G3 on loan from him.
Except for the fact that the retard submitted his own site. If you mouseover "Anonymous Coward" it's the same link as the host.
It needs 2n + 1 qubits; you start with a superposition, raise it to a power, then measure the result, collapsing the first superposition into a subset of logarithms. The discrete log step is the clincher: once you know the number has a log, you can just perform a Fourier transform on the superposition of logs, and the rest is all number theory.
And yes, you realistically need a LOT of extra qubits for error-correcting codes.
(Just for completeness, the University of Portland used this text for a 400-level semester course on QC. It's not too bad, although it expects you to be quite fluent in number theory and linear algebra.)
Very true. However, keep in mind that it's not really an exploit to do this. We would like to hope that we could prevent anyone from changing its parameters except us, but short of subclassing a text box, we can't do that.
(Well, it actually wouldn't be too bad. We write a single WndProc that discards all messages except the essentials for operation, NOT setup, and then call SetWindowLong after every initialization to "lock" it; to unlock it, call SetWindowLong again with the default WndProc from the EDIT class.)
Also consider that WM_PASTE is being issued from the textbox, and it's the textbox wndproc that's doing a DDE copy (or whatever they use these days for transfer in 2K/XP) from the clipboard to the text box's internal buffer. That's entirely client side.
The real vulnerability in this exploit is the lack of sanity checking on DefWindowProc's WM_TIMER handler, not the text box, since you could still look up the shellcode in the text box's buffer without having to copy it into your own. The text box is just one of the many unfortunate ways it could be abused, and a decent example of the more general problem of insecurity in the message dispatch design. However, there's no way to fix the general problem without breaking a number of apps - and we also can't really dismiss said apps as legacy, either. There's also a problem in that Windows doesn't take advantage of the x86's ability to mark blocks of memory as non-executable... but that's a whole other rant.
The only way I can see to prevent such abuses is to write a homebrew set of widgets that use genuine transactions to perform operations, providing security at the cost of performance. But then, if you allow the hacker to have a debugger installed on the system and ready to go, then you're also likely allowing him to replace the USER32 and KERNEL32 libraries with dummied ones, and really, nothing is safe.
Same here, but nobody does a range check for *every* single text box you have on every *keypress* event. It's not even an industry standard to do that. That makes this buffer overflow far more subtle than most.
Of course not - but if my keypress event involves copying text into a buffer, I'll be damned if I do it blindly. Doesn't mean I need to range check every single box, just the one that's being changed (and hence needing to be recopied).
"Not an industry standard" is hardly a valid excuse. You don't need to be paranoid on every single app and event you write, but "checking a buffer you're about to write into" is hardly some guru concept, it's damn near common sense.
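To show how little code that check costs, here's a minimal sketch in portable C. `copy_checked` is a hypothetical helper of mine, not anything out of the Win32 API, and it simply refuses input that won't fit rather than truncating it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src_len bytes plus a terminator into dst only if they fit.
   Returns 1 on success, 0 (dst left untouched) when the input is too long. */
int copy_checked(char *dst, size_t dst_cap, const char *src, size_t src_len)
{
    assert(dst != NULL && src != NULL);
    if (dst_cap == 0 || src_len >= dst_cap)   /* need room for the '\0' too */
        return 0;
    memcpy(dst, src, src_len);
    dst[src_len] = '\0';
    return 1;
}
```

With a 5-byte buffer behind a 4-character text box, a 4-character input is copied and anything longer is rejected before a single byte is written.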
1. If the program creates an edit control and specifies that the maximum input size is X, then it better stay X until it is changed by authorized code- namely the process that created it.
I'd agree with you. Practically, Windows doesn't provide a mechanism for finding the originating thread of a message, and I doubt they'll hack one in now.
The exploit code is most likely in the edit control's internal buffer- not the application's dynamic memory. Unless the user clicks OK or some sort of OnChanged handler has copied that data to a new location, the application has not had the chance to touch it. The attacker merely needs to guess the location of the edit control's internal buffer. Apparently, it's less difficult than one would hope.
The child window's memory is still inside the parent's process, on the heap. It's actually quite difficult to locate algorithmically, though - he admits in the article that his technique relies on attaching a debugger to the process and searching system memory byte-by-byte for the string FOON.
That, and he mentions a buffer overflow, so unless he's using terms he doesn't understand and assumed that happened, the textbox is sending an EN_CHANGE notification to its parent window. In reality I think his debugger's just finding it in the textbox and executing out of there. See next para.
Now, I have a question. How is that this code can be executed at all? Unless the permissions on the pages holding the exploit code permit execution, I don't see how the exploit can even run. Instead, I would expect to see some sort of access violation. Apparently not.
No current version of Windows clears the execute bit in the GDT/LDT descriptors. Ya, I think it's stupid too.
Re:What a load of tripe. on Shattering Windows · Score: 3, Informative
It is a Win32 vulnerability in that it is even possible to have this kind of elevation of privileges by "poorly written" services. Basically, any system service that interacts with the desktop is vulnerable (virus scanners). McAfee interacts with the desktop. This is bad too.
By the same token, is it a UNIX or Linux vulnerability if I run an insecure daemon suid root and someone buffer overflows it to get root privs?
It does, it has set a maxlength of 4. Unfortunately, the vulnerability allows this to be bypassed. I'm sure the GUI will complain when you press the OK button but since we never get that far, the app never gets a chance to check the length change.
I'd disagree with you there. VirusScan's assuming that nobody will fuck with its code (the worst mistake you can make when dealing with crackers) and that it'll never get more than 4 characters. Maybe it's just me, but I automatically check length whenever I do ANYTHING with a buffer, and usually I make a quick call to IsBadWritePtr(buf, len) or IsBadCodePtr(callbk) while I'm at it.
Yes, technically you can't prevent him from changing the text box's maxlength, short of subclassing the text box with a custom WndProc after initializing that discards said messages. But that doesn't mean you can't write your code in a robust manner.
It's too much to ask every app that doesn't use WM_TIMER to put in this handler. Further, I bet WM_TIMER isn't the only message he could use to trigger code.
I grant you that - MSFT fucked up by writing DefWindowProc() to blindly execute a callback. However, while tricky, I don't think it's fair to say that it's an unfixable problem, or that this is any different from some of the assumptions made in glibc's functions.
Fortunately, the community tends to be able to fix these things by patching the source. There is no fix for this problem.
Yep.
No no no... you've cheated. As a system service, you already have privileges
I was being sarcastic by that point. ^_^
Re:What a load of tripe. on Shattering Windows · Score: 4, Informative
Every window's internal structure includes a pointer to a function called a WndProc. The function reacts to messages, and effectively defines how and what that window does.
The problem is that the vast majority of apps out there look like this, mainly since Microsoft uses this structure in their own books and help files:
    switch (iMsgType) {
        case WM_CREATE: ...
        case WM_LBUTTONDOWN: ...
        /* etc... */
        default:
            /* everything not handled above falls through to the default procedure */
            return DefWindowProc(hwnd, iMsgType, wParam, lParam);
    }
Microsoft provides a moderately reasonable default procedure to fall out to, so that programmers don't have to write a case for every single one of the myriad standard messages.
The problem is that DefWindowProc doesn't check the validity of a code pointer before executing it; it just automatically jumps to any function specified in a WM_TIMER message.
So, without changing that function, any program that doesn't explicitly check for WM_TIMER is vulnerable... if there is a way to insert shellcode into the process's memory space. The exploit in the article can do it because VirusScan doesn't sanity-check its text boxes, so he can paste up to 4GB of material into the box, and tada.
Yes - but plenty of apps out there use WM_TIMER. You can't just discard the whole message, and since message hooks don't give you a handle for the sending thread, there's no way to tell whether the timer is a legitimate call from the app or a backdoor.
You're technically correct, but that still doesn't solve the problem at hand.
Not even really that - Hooks aren't allowed to modify messages, only watch them.
The only way I can see to defend against this is to carefully initialize all windows in the system before starting the message pump, using careful calls to GetMessage() with range filtering and checking the wndproc pointer at every step to make sure nobody's subclassed it from under your nose. Then, subclass all child windows to ignore EM_SETLIMITTEXT, WM_SETTEXT, and any other non-essential messages. It's hardly a trivial task, though, and it'd have to be done per-app, in which case re-writing the code to not call DefWindowProc on WM_TIMER or any other callback message would be cheaper.
That, and the whole thing becomes moot if I create two passthrough DLLs, user32 and kernel32, the former patched with a handy dandy backdoor, the latter with a patched LoadLibrary() call to prevent apps from loading the unpatched versions directly out of the system directory.
I'm really, really disgusted that this even got posted. This isn't a Win32 vulnerability, it's a VirusScan vuln. (Watch my karma burn, I'm actually defending MSFT... but hear me out.)
For those of you who aren't familiar with Windows programming, here's what he's doing. VirusScan's GUI is very poorly written and doesn't check for a maximum length on a text box's input. So, he adjusts the size of a textbox using an outside program to 4GB. (Windows unfortunately allows this, since the message format doesn't include a "sender" field to check against the owner handle.) He then inserts shellcode in it, attaches a debugger to the process and searches all of memory for the start of the shellcode. Real efficient, this one.
He then sends it a WM_TIMER message to trigger it. WM_TIMER is usually sent to your window on a regular interval when you've called SetTimer(), and carries an integral timer ID in wParam plus, optionally, a pointer to a callback in lParam. So, he sends it a fake WM_TIMER, and VirusScan executes the callback blindly.
You know what, I use WM_TIMER too in my apps - but, there's two simple ways to defend against it.
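    /* 1: only accept the callback address you registered yourself */
    if ((void *) msg.lParam != known_cb_address) { return false; }
    /* 2: at the very least, reject pointers the process can't even read */
    if (0 != IsBadCodePtr((FARPROC) msg.lParam)) { go_fuck_yourself(); }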
And if I'm not using it, special-case it so that it doesn't fall through to DefWindowProc().
Seriously, all this guy is doing is buffer overflowing a poorly written program to get Administrator privs. That's like claiming that glibc is insecure and should be thrown out because it has sprintf() or gets(). Ya know, I can buffer overflow a poorly written suid app too, but that doesn't make the libc to blame, nor have we published articles lambasting the GNU Project for not putting bounds checking into those functions.
This guy's just trying to sell himself, and you guys were more than helpful. Maybe I should write a system service that subclasses MSIE's WndProc with a single function that calls ExitProcess(1), and see if Slashdot will find me a security job.
There's two big things a quantum computer can do effortlessly that a normal one can't. (This doesn't mean a normal one can't emulate one, mind.) The first is superposition - an n-bit register can hold multiple values at once. In fact, more than that - it can hold them in unequal proportions.
The odd bit about a superposition is that if you measure it, it'll "collapse" into one of its possible values according to probability, after which it's identical to that value. 100% probability.
The other thing a QC can do is entanglement. In theory any quantum system is entangled, but when people talk about it with respect to algorithms, they mean this: if you run an algorithm on a superposition of values, you'll get back out another superposition. However, the input and output registers are now "entangled" and any change to one will be instantaneously reflected in the other.
Shor's algorithm uses this last property to effectively do operators in reverse. One way you can factor a number N is by finding the period of a base's powers modulo N, which is a discrete-log-style problem. So, you just create a superposition of all possible n-bit exponents, raise the base to each of them modulo N, then measure the result. Only a certain subset of those exponents produce the value you measured - so when you measure the results and get a single value, the source (which was previously all n-bit #s) will then become just the n-bit exponents that map to it.
(This is all gross oversimplification, but it's still pretty accurate. To be fully accurate I'd have to start writing down tensor products of matrices, and well, Slashdot sucks.;) )
Shor's algorithm requires a bit of number theory to prove its correctness, but the first part is the important one. You need 2n+1 qubits to factor an n-bit number - two n-bit registers and a keeper bit. (Note, I'm running from memory here, not my notes, so I may have a step wrong.)
You initialize the first with the Hadamard transform, creating a superposition of all possible n-bit numbers. You then pick a base coprime to the number N you want to factor, raise it to each exponent in that superposition modulo N, and store the result in the second register, which will itself be a superposition (and entangled with the first register). You then measure the second register - its collapse will trigger a partial collapse in the first register, leaving only the exponents that produce the measured value, and those are all congruent modulo the period of the base. You then perform a discrete Fourier transform on the first register to pull out that period, and the rest is all logic and repetition.
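The "logic and repetition" part is ordinary classical code, so here's a sketch of it in C with the quantum step replaced by brute force. To be clear about assumptions: `find_period` and `factor_from_period` are names I made up, the loop in `find_period` is exactly the part a quantum computer is supposed to speed up, and the arithmetic is only safe for small N (below 2^32) since nothing guards against overflow.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t gcd_u64(uint64_t a, uint64_t b)
{
    while (b) { uint64_t t = a % b; a = b; b = t; }
    return a;
}

/* base^exp mod m by square-and-multiply (m assumed below 2^32). */
static uint64_t pow_mod(uint64_t base, uint64_t exp, uint64_t m)
{
    uint64_t result = 1 % m;
    base %= m;
    while (exp) {
        if (exp & 1) result = (result * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return result;
}

/* Smallest r > 0 with a^r = 1 (mod n), found by brute force. */
uint64_t find_period(uint64_t a, uint64_t n)
{
    assert(n > 1 && gcd_u64(a, n) == 1);
    uint64_t x = a % n, r = 1;
    while (x != 1) { x = (x * a) % n; r++; }
    return r;
}

/* Nontrivial factor of n derived from base a, or 0 if this base is unlucky. */
uint64_t factor_from_period(uint64_t a, uint64_t n)
{
    uint64_t r = find_period(a, n);
    if (r % 2 != 0)
        return 0;                      /* odd period: try another base */
    uint64_t half = pow_mod(a, r / 2, n);
    if (half == n - 1)
        return 0;                      /* a^(r/2) = -1 (mod n): try another base */
    uint64_t f = gcd_u64(half - 1, n);
    return (f > 1 && f < n) ? f : 0;
}
```

For N = 15 and base 7 the period is 4, 7^2 mod 15 is 4, and gcd(3, 15) hands back the factor 3. Base 14 is one of the unlucky ones (period 2, and 14 is -1 mod 15), which is where the repetition comes in.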
Actually, you need double the number of bits just for a basic implementation of Shor's algorithm. It works by starting with a superposition of exponents and raising a base to them, then collapsing the entangled result, which forces the original superposition into the subset of exponents (the discrete logs) that produce it. The discrete log problem's one of the hardest in computer science, and solving that has the nice side effect of making factorization trivial.
In practice you'd need CONSIDERABLY more than 2048 bits. A superposition of numbers is really difficult to keep stable because of noise leaking from the outside world into the system - in practice you need extra bits for error correcting codes.
-1 too many mod points for editors.
Hmm... nice overflow. If that -1 gets read back as an unsigned 32-bit int, that'd be (2^32)-1 mod points, right? ;)
Just think... 4 billion mod points... one could mod down as "Underrated" every comment ever made on every story Slashdot has ever run, and still have plenty left over to mod up goatse links, since he'd never get hit for it in M2 either...
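For the record, here's what that wraparound looks like in C. This is only a sketch of the general effect; I have no idea how Slashdot actually stores mod points, and `as_unsigned_points`, `spend_one` and `spend_one_checked` are made-up names.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret a signed 32-bit counter as unsigned, as a display bug might.
   The conversion is defined in C as reduction modulo 2^32. */
uint32_t as_unsigned_points(int32_t points)
{
    return (uint32_t)points;
}

/* Spend a point from an unsigned counter with no underflow check. */
uint32_t spend_one(uint32_t points)
{
    return points - 1u;        /* 0 - 1 wraps around to 2^32 - 1 */
}

/* Same subtraction, but refusing to underflow. */
uint32_t spend_one_checked(uint32_t points)
{
    assert(points > 0);
    return points - 1u;
}
```

Either way you slice it, -1 read as unsigned comes out to 4294967295, i.e. 2^32 - 1.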
Now, that's an idea... take out fire insurance on a house, put a webserver in a wooden desk, and then submit a link to Slashdot with any of the following words in the summary:
GPL, BSD, RMS, Microsoft, Linux, Torvalds, Gates, Lego, DMCA, RIAA, CowboyNeal
Doesn't matter if the page on the server actually has to do with any of the above, just put 'em in the summary. Better yet, do several of them - for example, "CowboyNeal and Torvalds post log of their sordid affair online under GPL license (homosexual love diaries should be free as in speech, not as in beer); however, it may be in violation of the DMCA, since Richard Simmons owns a related patent..."
Tada! Instant money! The only question is, if you got caught, would Malda and company be liable for arson?
It needs 2n + 1 qubits; you start with a superposition, raise it to a power, then measure the result, collapsing the first superposition into a subset of logarithms. The discrete log step is the clincher: once you know the number has a log, you can just perform a Fourier transform on the superposition of logs, and the rest is all number theory.
And yes, you realistically need a LOT of extra qubits for error-correcting codes.
(Just for completeness, the University of Portland used this text for a 400-level semester course on QC. It's not too bad, although it expects you to be quite fluent in number theory and linear algebra.)
Very true. However, keep in mind that it's not really an exploit to do this. We would like to hope that we could prevent anyone from changing its parameters except us, but short of subclassing a text box, we can't do that.
(Well, it actually wouldn't be too bad. We write a single WndProc that discards all messages except the essentials for operation (and NOT setup) and then call SetWindowLong after every initialization to "lock" it, call it again to unlock it with the default WndProc from the TEXT atom.)
Also consider that WM_PASTE is being issued from the textbox, and it's the textbox wndproc that's doing a DDE copy (or whatever they use these days for transfer in 2K/XP) from the clipboard to the text box's internal buffer. That's entirely client side.
The real vulnerability in this exploit is the lack of sanity checking on DefWindowProc's WM_TIMER handler, not the text box, since you could still look up the shellcode in the text box's buffer without having to copy it into your own. The text box is just one of the many unfortunate ways it could be abused, and a decent example of the more general problem of insecurity in the message dispatch design. However, there's no way to fix the general problem without breaking a number of apps - and we also can't really dismiss said apps as legacy, either. There's also a problem in that Windows doesn't take advantage of the x86's ability to mark blocks of memory as non-executable... but that's a whole other rant.
The only way I can see to prevent such abuses is to write a homebrew set of widgets that use genuine transactions to perform operations, providing security at the cost of performance. But then, if you allow the hacker to have a debugger installed on the system and ready to go, then you're also likely allowing him to replace the USER32 and KERNEL32 libraries with dummied ones, and really, nothing is safe.
Same here but nobody does a range check for *every* single text box you have on every *keypress* event. It's not even an industry standard to do that. That makes this buffer overflow far more subtle than most.
Of course not - but if my keypress event involves copying text into a buffer, I'll be damned if I do it blindly. Doesn't mean I need to range check every single box, just the one that's being changed (and hence needing to be recopied).
"Not an industry standard" is hardly a valid excuse. You don't need to be paranoid on every single app and event you write, but "checking a buffer you're about to write into" is hardly some guru concept, it's damn near common sense.
1. If the program creates an edit control and specifies that the maximum input size is X, then it better stay X until it is changed by authorized code- namely the process that created it.
I'd agree with you. Practically, Windows doesn't provide a mechanism for finding the originating thread of a message, and I doubt they'll hack one in now.
The exploit code is most likely in the edit control's internal buffer- not the application's dynamic memory. Unless the user clicks OK or some sort of OnChanged handler has copied that data to a new location, the application has not had the chance to touch it. The attacker merely needs to guess the location of the edit control's internal buffer. Apparently, it's less difficult than one would hope.
The child window's memory is still inside the parent's process, on the heap. It's actually quite difficult to locate algorithmically, though - he admits in the article that his technique relies on attaching a debugger to the process and searching system memory byte-by-byte for the string FOON.
That, and he mentions a buffer overflow, so unless he's using terms he doesn't understand and assumed that happened, the textbox is sending an EN_CHANGE notification to its parent window. In reality I think his debugger's just finding it in the textbox and executing out of there. See next para.
Now, I have a question. How is that this code can be executed at all? Unless the permissions on the pages holding the exploit code permit execution, I don't see how the exploit can even run. Instead, I would expect to see some sort of access violation. Apparently not.
None of the current Windows versions clear the execute bit in the GDT/LDT descriptors. Ya, I think it's stupid too.
It is a Win32 vulnerability in that it is even possible to have this kind of elevation of privileges by "poorly written" services. Basically, any system service that interacts with the desktop is vulnerable (virus scanners). McAfee interacts with the desktop. This is bad too.
By the same token, is it a UNIX or Linux vulnerability if I run an insecure daemon suid root and someone buffer overflows it to get root privs?
It does, it has set a maxlength of 4. Unfortunately, the vulnerability allows this to be bypassed. I'm sure the GUI will complain when you press the OK button but since we never get that far, the app never gets a chance to check the length change.
I'd disagree with you there. Viruscan's assuming that nobody will fuck with its code (the worst mistake you can make when dealing with crackers) and that it'll never get more than 4 characters. Maybe it's just me, but I automatically check length whenever I do ANYTHING with a buffer, and usually I make a quick call to IsBadWritePtr(buf, len) or IsBadCodePtr(callbk) while I'm at it.
Yes, technically you can't prevent him from changing the text box's maxlength, short of subclassing the text box with a custom WndProc after initializing that discards said messages. But that doesn't mean you can't write your code in a robust manner.
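For what it's worth, the "check the buffer before you write into it" habit costs a handful of lines. Here's a minimal sketch in portable C, not Viruscan's code and not anything from the Win32 API; copy_checked is a name made up for illustration. It rejects any input that wouldn't fit, terminator included, and leaves the destination untouched when it refuses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a Win32 or libc function): copy src into dst
   only if src_len bytes plus the terminating '\0' fit in dst_size bytes.
   Returns 1 on success, 0 if the input was rejected.
   Nothing is written to dst on failure. */
int copy_checked(char *dst, size_t dst_size, const char *src, size_t src_len)
{
    if (dst == NULL || src == NULL || dst_size == 0)
        return 0;
    if (src_len >= dst_size)    /* no room for the data plus '\0' */
        return 0;
    memcpy(dst, src, src_len);
    dst[src_len] = '\0';
    return 1;
}
```

With a 5-byte buffer (a 4-character field plus terminator), a 4-character string is accepted and a 5-character one is refused, no matter what anybody did to the text box's limit in the meantime.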
It's too much to ask every app that doesn't use WM_TIMER to put in this handler. Further, I bet WM_TIMER isn't the only message he could use to trigger code.
I grant you that - MSFT fucked up by writing DefWindowProc() to blindly execute a callback. However, while tricky, I don't think it's fair to say that it's an unfixable problem, or that this is any different from some of the assumptions made in glibc's functions.
Fortunately, the community tends to be able to fix these things by patching the source. There is no fix for this problem.
Yep.
No no no... you've cheated. As a system service, you already have privileges
I was being sarcastic by that point. ^_^
Every window's internal structure includes a pointer to a function called a WndProc. The function reacts to messages, and effectively defines how and what that window does.
... ...
The problem is that the vast majority of apps out there look like this, mainly since Microsoft uses this structure in their own books and help files:
switch (iMsgType)
{
case WM_CREATE:
case WM_LBUTTONDOWN:
    /* etc... */
default:
    /* anything not handled above, WM_TIMER included, ends up here */
    return DefWindowProc(hwnd, iMsgType, wParam, lParam);
}
Microsoft provides a moderately reasonable default procedure to fall out to, so that programmers don't have to write a case for every single one of the myriad standard messages.
The problem is that DefWindowProc doesn't check the validity of a code pointer before executing it, it just automatically jumps to any function specified in a WM_TIMER message.
So, without changing that function, any program that doesn't explicitly check for WM_TIMER is vulnerable... if there is a way to insert shellcode into the process' memory space. The exploit in the article can do it because Viruscan doesn't sanity-check their text boxes, so he can paste up to 4GB of material into the box, and tada.
Yes - but plenty of apps out there use WM_TIMER. You can't just discard the whole message, and since message hooks don't give you a handle for the sending thread, there's no way to tell whether the timer is a legitimate call from the app or a backdoor.
You're technically correct, but that still doesn't solve the problem at hand.
Not even really that - Hooks aren't allowed to modify messages, only watch them.
The only way I can see to defend against this is to carefully initialize all windows in the system before starting the message pump, using careful calls to GetMessage() with range filtering and checking the wndproc pointer at every step to make sure nobody's subclassed it from under your nose. Then, subclass all child windows to ignore EM_SETLIMITTEXT, WM_SETTEXT, and any other non-essential messages. It's hardly a trivial task, though, and it'd have to be done per-app, in which case re-writing the code to not call DefWindowProc on WM_TIMER or any other callback message would be cheaper.
That, and the whole thing becomes moot if I create two passthrough DLLs, user32 and kernel32, the former patched with a handy dandy backdoor, the latter with a patched LoadLibrary() call to prevent apps from loading the unpatched versions directly out of the system directory.
I'm really, really disgusted that this even got posted. This isn't a Win32 vulnerability, it's a Virusscan vuln. (Watch my karma burn, I'm actually defending MSFT... but hear me out.)
For those of you who aren't familiar with Windows programming, here's what he's doing. Viruscan's GUI is very poorly written and doesn't check for a maximum length on a text box's input. So, he adjusts the size of a textbox using an outside program to 4GB. (Windows unfortunately allows this, since the message format doesn't include a "sender" field to check against the owner handle.) He then inserts shellcode in it, attaches a debugger to the process and searches all of memory for the start of the shellcode. Real efficient, this one.
He then sends it a WM_TIMER message to trigger it. WM_TIMER is usually sent to your window on a regular interval when you've called SetTimer(), and contains either an integral ID or a pointer to a callback in memory. So, he sends it a fake WM_TIMER, and Viruscan executes the callback blindly.
You know what, I use WM_TIMER too in my apps - but, there's two simple ways to defend against it.
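/* 1. Only accept the callback address we registered ourselves. */
if ((void *) msg.lParam != known_cb_address)
{
    return false;
}
/* 2. Or at least check that the pointer refers to readable memory. */
if (0 != IsBadCodePtr((FARPROC) msg.lParam))
{
    go_fuck_yourself();
}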
And if I'm not using it, special-case it so that it doesn't fall through to DefWindowProc().
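To make the first check concrete without dragging in windows.h, here's a rough portable-C simulation of it. Everything in it is a stand-in I've made up for illustration: struct message, MSG_TIMER, handle_message and on_tick are not the real Win32 types or calls, they just mimic the shape of a WM_TIMER message whose lParam may carry a callback address. The point is that the handler only ever jumps to the one address it already knows about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the Win32 message types; NOT the real API. */
typedef void (*timer_cb)(void);

enum { MSG_TIMER = 0x0113 };    /* same numeric value as WM_TIMER */

struct message {
    unsigned  type;
    uintptr_t wparam;           /* timer ID */
    uintptr_t lparam;           /* callback address, or 0 for none */
};

int ticks;                      /* bumped by the one legitimate callback */
void on_tick(void) { ticks++; }

/* Returns 1 if the message was handled, 0 if it was dropped. */
int handle_message(const struct message *m, timer_cb known_cb)
{
    assert(m != NULL);
    if (m->type != MSG_TIMER)
        return 0;               /* other messages are out of scope here */
    if (m->lparam == 0)
        return 1;               /* plain timer tick, no callback to run */
    if (m->lparam != (uintptr_t)known_cb)
        return 0;               /* unknown address: refuse to jump to it */
    known_cb();
    return 1;
}
```

A message carrying the address of on_tick runs it once; a message carrying any other address is dropped before anything gets executed, which is exactly what falling through to the default handler doesn't give you.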
Seriously, all this guy is doing is buffer overflowing a poorly written program to get Administrator privs. That's like claiming that glibc is insecure and should be thrown out because it has sprintf() or gets(). Ya know, I can buffer overflow a poorly written suid app too, but that doesn't make the libc to blame, nor have we published articles lambasting the GNU Project for not putting bounds checking into those functions.
This guy's just trying to sell himself, and you guys were more than helpful. Maybe I should write a system service that subclasses MSIE's WndProc with a single function that calls ExitProcess(1), and see if Slashdot will find me a security job.
It's actually a neat way of doing it.
;) )
There's two big things a quantum computer can do effortlessly that a normal one can't. (This doesn't mean a normal one can't emulate one, mind.) The first is superposition - an n-bit register can hold multiple values at once. In fact, more than that - it can hold them in unequal proportions.
The odd bit about a superposition is that if you measure it, it'll "collapse" into one of its possible values according to probability, after which it's identical to that value. 100% probability.
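If it helps, the "collapse according to probability" part can be mimicked classically. The sketch below is only that, a toy: it tracks plain probabilities rather than the complex amplitudes a real quantum state has, and measure and collapse are names I've invented for illustration. The caller supplies the random number so the behavior is easy to follow.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: pick which value a register "collapses" to.
   prob[i] is the probability of observing value i (entries sum to 1);
   u is a uniform random number in [0, 1) supplied by the caller. */
size_t measure(const double *prob, size_t n, double u)
{
    assert(prob != NULL && n > 0);
    double cumulative = 0.0;
    for (size_t i = 0; i < n; i++) {
        cumulative += prob[i];
        if (u < cumulative)
            return i;
    }
    return n - 1;   /* guard against rounding when u is very close to 1 */
}

/* After measurement the register holds the observed value with certainty. */
void collapse(double *prob, size_t n, size_t observed)
{
    for (size_t i = 0; i < n; i++)
        prob[i] = (i == observed) ? 1.0 : 0.0;
}
```

Start with unequal proportions like {0.5, 0.25, 0.125, 0.125}, measure once, collapse to whatever came out, and every later measurement returns that same value.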
The other thing a QC can do is entanglement. In theory any quantum system is entangled, but when people talk about it with respect to algorithms, they mean this: if you run an algorithm on a superposition of values, you'll get back out another superposition. However, the input and output registers are now "entangled" and any change to one will be instantaneously reflected in the other.
Shor's algorithm uses this last property to effectively do operators in reverse. One way you can factor a number is by finding out if its log in a given base is 0 in a certain modulo. So, you just create a superposition of all possible n-bit numbers, raise them to that base, then measure the result. A certain subset of those numbers will have an integral log - when you measure the results and get a single log, the source (which was previously all n-bit #s) will then become all n-bit numbers that have a log base N.
(This is all gross oversimplification, but it's still pretty accurate. To be fully accurate I'd have to start writing down tensor products of matrices, and well, Slashdot sucks.)
What happens if you don't know the base? ^_^
Shor's algorithm requires a bit of number theory to prove its correctness, but the first part is the important one. You need 2n+1 qubits to factor an n-bit number - two n-bit registers and a keeper bit. (Note, I'm running from memory here, not my notes, so I may have a step wrong.)
You initialize the first with the Hadamard transform, creating a superposition of all possible n-bit numbers. You then raise that superposition to the power of your number to factor, modulo 2^n, and store the result in the second register, which will itself be a superposition (and entangled with the first register). You then measure the second register - and as long as it doesn't measure to be zero, its collapse will trigger a partial collapse in the first register, resulting in a set of bases which are congruent modulo the second register's collapsed result. You then perform a discrete Fourier transform on the first register, and the rest is all logic and repetition.
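The "logic and repetition" at the end is ordinary classical arithmetic, so it can be sketched in plain C. Big caveats: this follows the usual textbook formulation (pick a base a, find the period r of a^x mod n, then take a gcd), which may not match the from-memory description above step for step, and the quantum part is replaced by a brute-force loop, so there's no speedup here at all. All function names are made up for illustration, and n is assumed to fit in 32 bits so the products don't overflow.

```c
#include <assert.h>
#include <stdint.h>

uint64_t gcd_u64(uint64_t a, uint64_t b)
{
    while (b != 0) { uint64_t t = a % b; a = b; b = t; }
    return a;
}

/* a^e mod n by square-and-multiply; n < 2^32 keeps products in 64 bits. */
uint64_t pow_mod(uint64_t a, uint64_t e, uint64_t n)
{
    uint64_t result = 1 % n;
    a %= n;
    while (e > 0) {
        if (e & 1) result = (result * a) % n;
        a = (a * a) % n;
        e >>= 1;
    }
    return result;
}

/* Classical stand-in for the quantum step: smallest r > 0 with
   a^r = 1 (mod n). Brute force; requires gcd(a, n) == 1. */
uint64_t find_period(uint64_t a, uint64_t n)
{
    uint64_t r = 1, x = a % n;
    while (x != 1) { x = (x * a) % n; r++; }
    return r;
}

/* Try to split n using base a. Returns a nontrivial factor of n,
   or 0 if this base is no good and another one should be tried. */
uint64_t shor_classical(uint64_t n, uint64_t a)
{
    assert(n > 3 && n < (1ULL << 32) && a > 1 && a < n);
    uint64_t g = gcd_u64(a, n);
    if (g != 1) return g;           /* lucky guess: a shares a factor with n */
    uint64_t r = find_period(a, n);
    if (r % 2 != 0) return 0;       /* odd period: unusable */
    uint64_t h = pow_mod(a, r / 2, n);
    if (h == n - 1) return 0;       /* a^(r/2) = -1 (mod n): unusable */
    return gcd_u64(h + 1, n);
}
```

For n = 15 and a = 7 the period is 4, 7^2 mod 15 is 4, and gcd(5, 15) hands back the factor 5. A base like a = 14 has period 2 with 14 = -1 (mod 15), so it returns 0 and you repeat with a different base.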
Actually, you need double the number of bits just for a basic implementation of Shor's algorithm. It works by starting with a superposition of numbers and taking the log of that, then collapsing the entangled result, which forces the original superposition into a subset of reverse logs. The reverse log problem's one of the hardest in computer science, and solving that has the nice side effect of making factorization trivial.
In practice you'd need CONSIDERABLY more than 2048 bits. A superposition of numbers is really difficult to keep stable because of noise leaking from the outside world into the system - in practice you need extra bits for error correcting codes.
You could serrate the edge of the rule and use it to drive the gear of a small DC generator in the slide... but man, that could get nasty. ;)