Perhaps for a very specific definition of "swap", you might be correct, but swap has a more general meaning. Your definition seems to be full process swapping (where swap = suspend process, send all code and data to disk). That certainly is pretty rare, and is generally a last resort. Since it is so rare, the verbs "page" and "swap" generally get conflated and discussed together. On most systems (pretty much all systems with virtual memory support), swapping/paging definitely happens way before memory fills completely, as you could see using any system monitor like top (Unix) or TaskMan (Windows).
No. "Enough ram for file caching" is approximately infinite RAM, so the premise is flawed. If you did somehow have "enough" RAM that there was no noticeable benefit from having more file cache, then adding a swap file would make no difference one way or another -- the system simply wouldn't ever swap anything out. (You might still see some of the swap file as "in use", but that's just the OS doing aggressive optimization. The OS will copy data from RAM to disk during idle cycles. It does this just in case there's a sudden increase in memory requests. If data has already been sent to the swap file and the OS later decides to give the RAM to some other program, the other program doesn't have to wait for the RAM to be copied to disk since it's already there -- the memory just has to be zeroed out and it is ready to reassign to the new program. But if the data is needed before some other program needs the memory, the data is in RAM ready for use, and the copy on disk doesn't hurt anything.)
In any case, it's generally best to let the OS pick what stays in RAM based on what is used most often. If you're using files more than apps, your file cache will grow at the expense of RAM reserved for apps. If you're using apps more than files, your file cache will shrink. Removing swap from the equation simply means that you've made it impossible for the OS to do the best job of optimizing your system's performance if it gets low on free RAM -- instead of keeping stuff in RAM based on what gets used most often, it has to keep stuff in RAM because it has nowhere else to put it.
That would be the case if we were trying to focus atomic forces or electrons. Instead, since we're trying to focus certain wavelengths of light, only differences that are detectable with those wavelengths of light will impact the result.
Yup. Drivers have the ability to provide access to anybody they want. It is the responsibility of the driver's author to determine whether or not to grant "update firmware" access to a particular account. You are correct that the driver should not be doing this.
Actually, depending on how you look at it, you could consider the Vista audio driver architecture to be much simpler. Some pieces have been separated out, but they are simpler pieces.
Creative is all up in arms because Vista doesn't have as many hooks for "audio acceleration" as XP did. But you know where that got us -- I can't count the number of times my system crashed in the audio driver, and Creative's drivers were among the worst. Microsoft got fed up with the number of crashes in the audio drivers (all blamed on MS, of course) and decided to make a simple, solid architecture. Maybe in the future they'll put some hooks back in, but for now, the driver doesn't have much to do, and that means that in many cases, all that fancy hardware doesn't have anything to do. Fine by me - all I want is to get the audio data out to my speakers reliably without crashing my system.
Anyway, the basic idea is this:
1. User mode app talks to the audio API (a DLL loaded into the App's process). The API talks to the audio service (which runs in User mode) and requests a "connection". The audio service allocates some memory and shares it with the app.
2. From this point on, most of the communication between the app and the audio service is done via the shared buffers. This is nice because there is no context switching or kernel-mode call needed to fill the buffer. The app loads the shared memory with say 10 ms worth of audio data, and the service mixes that data with the data from any other apps and puts the mixed data into a buffer. All effects and mixing are done in user mode.
3. The audio driver knows where the audio service is putting the mixed audio data. It picks up the data directly from that buffer. Again, no context switching is needed and no data needs to be copied from process to process or user mode to kernel mode. No mixing or audio processing is done in kernel mode. All the driver does is initialize the audio card and get the data from the buffer to the audio card.
The result of all of this is an audio stack where all of the complex code is provided by Microsoft, follows the same code path for all audio cards, and has almost no overhead (buffer copying or context switching). There are only two issues. First, the mixing and effects all need to be done in software (not a big deal - mixing and typical audio effects take much less than 1% of a modern processor's time). Second, the audio service and kernel mode driver need to be reliably scheduled to run once every 10 ms or so. That is also not a big deal if your hardware and drivers follow the ACPI rules, though I've seen some issues where a laptop motherboard is too aggressive about going into deep sleep, can't wake up quickly enough, and misses some deadlines, causing audio pops.
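To make the mixing step concrete, here's a rough sketch in Python. This is not the Vista audio engine's code. The 48 kHz sample rate, the 16-bit sample range, and the name `mix_buffers` are all my own assumptions for illustration; the only number taken from the description above is the 10 ms period.

```python
# Hypothetical sketch: mix one 10 ms period of audio from several apps.
SAMPLE_RATE = 48_000          # assumed sample rate (frames per second)
PERIOD_MS = 10                # period length mentioned in the text
FRAMES_PER_PERIOD = SAMPLE_RATE * PERIOD_MS // 1000

INT16_MIN, INT16_MAX = -32768, 32767   # assumed 16-bit signed samples


def mix_buffers(app_buffers):
    """Sum the per-app sample buffers and clamp to the 16-bit range."""
    mixed = [0] * FRAMES_PER_PERIOD
    for buf in app_buffers:
        if len(buf) != FRAMES_PER_PERIOD:
            raise ValueError("each app must supply exactly one period")
        for i, sample in enumerate(buf):
            mixed[i] += sample
    # Clamp so that loud sources saturate instead of wrapping around.
    return [max(INT16_MIN, min(INT16_MAX, s)) for s in mixed]
```

The point is just that the per-period work is a handful of additions per sample, which is why doing it in software costs so little.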
Yeah, I wasn't distinguishing between "DX10 mode" and "DX10 hardware" in my post. Sure, anything with new drivers and any acceleration will run in Aero, but very little DX10 hardware support is out there yet.
My laptop uses an ATI integrated graphics chip, but your analysis is otherwise accurate.
Both kinds of full screen cause trouble. The best playback method is for me to use Media Player Classic and choose... oh, I forget exactly which output method it is... one of them is incompatible with desktop composition, so it is kind of an easy way to get Vista to automatically downgrade me when I start playing something.
DX 10 is not designed to force anybody to do anything. It was a big change in the way DirectX works so it required significant changes in the kernel's video system and significant changes in the structure of video drivers. That kind of thing is really hard to stuff into a service pack.
I think that in the long term, the change (moving to the Vista video architecture) will be a good thing. The Vista video model seems to address a lot of real issues like sharing the 3D features of the video card (previously not a real possibility). In the short term, the change is a bit painful and offers no real benefit (just nifty eye candy and effects). If I were a game developer, I certainly wouldn't develop any games that only run on DX10.
I don't think that is entirely unexpected -- most developers still support DX8. However, just like most developers can expect most of their gamers to have DX9 hardware and software, eventually developers will be able to expect gamers to have DX10 hardware and software. Then there will be benefits.
In the meantime, I can understand some frustration. For example, due to my laptop's lousy video driver, I can't play full-screen video in DX10 (Aero transparency enabled) mode. However, if I switch to the "Basic" mode, suddenly all is well. So this is certainly painful.
Certain parts of network handling occur in the kernel with scheduling disabled, which means that they have the potential to cause glitches in the audio if too many packets are handled per second. Vista tries to provide low-latency audio, so it has less tolerance for delays than previous versions of Windows. So the audio guys talked to the networking guys and decided there would be a flag that says "please reserve some time for audio processing if an audio stream is being processed". Unfortunately, the reservation was way too aggressive.
No excuses - this is a bad bug and should never have happened. The behavior is designed to fix a problem, but the fix created a new one. It will be fixed. Hopefully that fix won't cause more trouble...
Obviously, nobody likes a crash. The best behavior is to tell the user that the document is not valid, or perhaps try to correct for the error and go on. So I'm fairly confident that the developers involved will examine the crash reports, fix the issue in the Word codebase, and continue on. Whether that fix shows up in a patch depends on the impact, risk, cost, etc.
As has often been seen, every patch has a certain potential to disrupt the rest of the system or uncover a new (possibly worse) issue. It is very possible that by fixing the cause of the crash, Microsoft would allow the system to continue into behavior worse than crashing. Pushing out a patch is not the right thing to do if the risk of fixing the issue is high and the value of fixing the issue is low.
Severity is hard to define. If the bug is only going to happen if the user opens a bad document, and the bug immediately exits the system and prevents the user from viewing the bad document, I'm not too worried. If the bug causes the system to slow down and the user has to end the process, I'm still not too worried. If the bug causes the system to stop responding, then I'm a bit worried and it needs to be fixed as soon as the patch can be carefully tested to ensure it doesn't cause other problems; it probably will show up in an optional patch, KB article, and/or service pack. If the bug exposes a security vulnerability, I'm really worried and you'll probably see a patch for this issue next Patch Tuesday.
There are a lot of conditions where a programming error will be caught by the system, resulting in a "safe crash". Certain system APIs have built-in assertions that will immediately terminate the application if faulty behavior is noticed. While it would be great for the software to be bug-free, we all know that bug-free software is pretty hard to come by at any cost. Microsoft has made a big investment in setting things up so that when a potentially dangerous bug is encountered at runtime, the program exits immediately and optionally sends a crash dump to Microsoft. This process is sometimes called "InstaDeath". For example, the compiler's stack buffer overrun protection will trigger InstaDeath if it detects a stack or heap buffer overflow. An attempt to call an unregistered exception handler will trigger InstaDeath. The secure CRT functions (highly recommended!) will optionally trigger InstaDeath (mostly because if you don't expect and properly handle string truncation, that can also be a security issue).
While this might result in more crashes, it also results in the following good behavior:
- The crash occurs at the precise point of the unexpected condition instead of some random time later (or not at all). Instead of allowing the attacker's code to run or corrupting data, the app exits immediately. The crash dump points directly at the place where the unexpected situation was encountered, making a fix much more likely.
- Entire classes of bugs will now always result in an immediate crash every time. While crashing is never ideal, it's better than security issues or data corruption. Bugs have to be found and fixed one by one. If a class of bugs now always causes an immediate crash, it can never be used for an exploit (though it can be a DoS).
Again, crashing is never good. But in some cases, it is the right response. And if the crash occurs via the appropriate mechanism, it is a controlled crash that cannot be exploited (except for DoS).
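A toy illustration of the "crash at the precise point" idea, in Python. This is only a sketch of the principle, not how the compiler checks or the secure CRT are implemented; `FailFast` and `checked_copy` are names I made up.

```python
class FailFast(SystemExit):
    """Raised to end the program at once; stands in for a controlled crash."""


def checked_copy(dest: bytearray, src: bytes) -> None:
    """Copy src into the start of dest, refusing to write past the end of dest."""
    if len(src) > len(dest):
        # Stop right where the bad condition is detected, before any
        # memory is touched, so the failure report points at the cause.
        raise FailFast(f"buffer overrun: {len(src)} bytes into {len(dest)}")
    dest[:len(src)] = src
```

If nothing catches `FailFast`, the program exits at the call that was handed the oversized input, instead of corrupting `dest` and misbehaving at some random time later.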
I'm sure the bugs will be investigated, and if appropriate, patches will be issued. However, the security researchers at Microsoft are not wrong in saying that crashing is better than continuing on in a bad state. And they're not wrong in saying that crashing is not always a security flaw.
One last real example. I recently read about an exploitable browser flaw that was only exploitable because the browser did not crash. The exploit data caused an access violation nearly every time. Once in a great while, the exploit data successfully attacked the system. Under no
It's also important to note that both 32-bit and 64-bit Vista maintain a concept of kernel "taint". If you load an unsigned driver, the kernel is marked as tainted. Apps have the option of refusing to run if the kernel is tainted. It is expected that in the future, WinDVD and similar products will do this. (You CAN load an unsigned driver in 64-bit Vista, but you have to turn off the on-by-default restriction each time you boot.)
Of course, if you load a driver that is unsigned, it can try to mark the kernel as untainted again. Then Microsoft will issue a patch to prevent this, then the driver will be updated to work with the patch, then Vista will be updated again, then the driver will be updated again...
Facts: Lua looks kinda like Pascal. The source is compiled to bytecode (at runtime or ahead of time) for efficiency. The license is not at all restrictive. The engine is about 90k compiled, 112k if you add all standard libraries. It runs very quickly for an interpreted language. The language is very simple (grammar fits on a page). The engine handles huge programs with no trouble. Lua implements closures (lexical scoping). Unreferenced data is garbage collected. A kind of exception handling is available (essentially try-catch with no filter). Most functionality is provided via C libraries.
Nice things: The C-to-Lua and Lua-to-C interface is probably best-in-breed (extremely easy to work with, way easier than embedding Python or Perl).
Interesting things: Native types include number (only one kind, "double" by default), immutable string (binary-safe), boolean, nil, table (hashtable), function (either a Lua function or a C function), and userdata (opaque blobs of data that C code can use). The only data structure is the table. Tables are used to implement structures (map a string member name to a value), arrays (map a number to a value), classes (map a string to a value; functions are values too). Even the set of currently-accessible variables (the environment) is a table.
Tricky things: While the table is very flexible, it sometimes takes some stretching to make it cover all the cases. You need to learn a number of idioms to be really effective with Lua.
I've written most of a library in Lua for easy-to-write self-contained apps. It was incredibly easy to use. (Maybe someday I'll actually finish my library...)
/analyze is pretty good. If you're using one of the more expensive editions of Visual Studio, support for /analyze is built into the IDE and very convenient.
With the latest versions of the Windows SDK, /analyze becomes much more powerful. /analyze has built-in models for the behavior of some CRT-defined functions, but all other functions are black boxes. The newest CRT and Windows SDK headers (as well as any .h files generated by a recent version of MIDL) have all been annotated with "SAL" annotations that tell PREfast or /analyze how to model their behavior. For example, here is strncpy:
SAL annotations are tags like "__in" that are #define'd to nothing for normal compilation, but are understood by /analyze and PREfast as indicating constraints on the parameters passed to a function and the function's return values. If one path through your code calls strncpy(dest, src, 45) when dest == NULL or dest == char[44], /analyze will flag an error.
PREfast has an extensive plugin system that is missing from /analyze. In addition, /analyze is not configurable (it's either on or off, nothing in between). Finally, because it is not configurable, some classes of warnings that are often false positives have been disabled for /analyze. But /analyze is still incredibly useful.
Nit-pick: There are some nice administrative advantages to VM, so this isn't always true. If the administrative advantages outweigh the overhead, one VM on a host could still be worthwhile.
Um, did you read my earlier post? There's a region of physical address space reserved for hardware. It varies from 500 to 900 MB, depending on your system's hardware. 4 GB - 900 MB = 3.1 GB.
A 32-bit virtual address space means that pointers are 32 bits, so there is a natural limit of 4 GB of addresses simultaneously available to a process. However, the range of physical memory mapped into the process can change. And different processes can have different ranges of memory mapped into them. Thus even though each process is limited to 4 GB at once, a process can make use of more than that by swapping data in and out of its address space (assuming appropriate OS support). Another way is to have 3 or more processes, each individually limited to 4 GB, with each assigned its own 2 GB of physical memory, meaning that the system actually makes use of 6 GB or more.
The OS is also limited in address space, but the OS can map memory in and out of its own address space as needed. In the same way that the OS can swap memory out to disk, the OS can throw pages out of its address space when they aren't needed, then pull them back in. Thus, the OS can access more than 4 GB of RAM.
Kernel mode also uses a 32-bit virtual address space, but often drivers have to work with physical addresses. For example, a driver needs to transfer data to the hardware. So it takes the address of the memory buffer with the data, asks Windows for the corresponding physical address, and sends the physical address and the buffer length to the device. The device then does a DMA transfer to or from the given physical address and signals to the driver when it is done. If the driver doesn't properly handle 64-bit physical addresses, this process will not work correctly. By never using any physical addresses where any of the upper 32 bits are set, Windows is able to avoid one possible problem with 32-bit device drivers. (There are still many other possible problems.)
Recent x86 CPUs (Pentium Pro and later) have 36 address pins and can address 64 GB of RAM. This is done by using PAE mode. PAE mode changes the layout of the page tables. Page tables map 32-bit virtual addresses to physical addresses. Without PAE, the 32-bit virtual addresses map through 2 levels of page tables (1 level for huge pages) and are translated to 32-bit physical addresses. With PAE, the 32-bit virtual addresses map through 3 levels of page tables (2 levels for huge pages) and are translated to physical addresses wider than 32 bits. The page table entries grow from 32 to 64 bits, which is why software has to treat physical addresses as 64-bit values.
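Here's a small Python sketch of how a 32-bit virtual address gets carved up in each mode, assuming ordinary 4 KB pages. The bit widths (10/10/12 without PAE, 2/9/9/12 with PAE) are the standard x86 layout; the function names are just mine.

```python
def split_non_pae(vaddr: int):
    """Return (directory index, table index, page offset): 10 + 10 + 12 bits."""
    return (vaddr >> 22) & 0x3FF, (vaddr >> 12) & 0x3FF, vaddr & 0xFFF


def split_pae(vaddr: int):
    """Return (PDPT index, directory index, table index, page offset): 2 + 9 + 9 + 12 bits."""
    return (
        (vaddr >> 30) & 0x3,
        (vaddr >> 21) & 0x1FF,
        (vaddr >> 12) & 0x1FF,
        vaddr & 0xFFF,
    )


# 36 physical address lines reach 2**36 bytes.
ADDRESSABLE_GB_WITH_36_PINS = 2**36 // 2**30
```

The virtual address stays 32 bits either way. PAE only adds a level and widens the entries, so each table holds 512 entries instead of 1024.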
Side-note: PAE is also related to page execution protection, called "hardware DEP" (Microsoft term), "NX" (AMD term), and "XD" (Intel term). In 32-bit x86 processors, this can only be used in PAE mode. This is why you might see PAE mode used even on systems with less than 4 GB of memory.
One thing that can prevent access to more than 4 GB of RAM is motherboard design. PAE can only access 64 GB of memory if all 36 address pins are properly wired up on the motherboard. This is not always the case, since those extra 4 wires actually do make the motherboard just a little bit more expensive to design and manufacture. Many motherboards only have 32 address pins connected. If that is the case, no OS will be able to access more than 4 GB of address space. Since your hardware uses up the top 500-900 MB or so of address space, your system will be limited to somewhere around 3.1 GB of RAM.
Another hardware limitation is the ability of the chipset to remap RAM. If you have 4 GB of RAM, and 600 MB of address space is used up by PCI/AGP reserved areas, the only way to access the top 600 MB of RAM is to remap it into the addresses above the 4 GB boundary. Not all chipsets are able to do this, so some will just waste any RAM that happens to be shadowed by a PCI/AGP reserved region.
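The arithmetic behind these hardware limits can be sketched as a little model in Python. It is simplified: it assumes the reserved region sits directly below the 4 GB boundary and that RAM is otherwise contiguous. The function name and parameters are made up for illustration.

```python
GB = 1024 ** 3
MB = 1024 ** 2


def usable_ram(installed, reserved, can_remap, max_phys_addr=4 * GB):
    """Estimate the RAM the OS can reach (all sizes in bytes).

    installed     -- RAM physically present
    reserved      -- address space just below 4 GB claimed by PCI/AGP devices
    can_remap     -- True if the chipset can relocate shadowed RAM above 4 GB
    max_phys_addr -- highest physical address the board or OS will use
    """
    low = min(installed, 4 * GB - reserved)    # RAM below the device hole
    rest = installed - low
    # Without remapping, RAM hidden behind the hole is simply lost.
    lost = 0 if can_remap else min(rest, reserved)
    high = rest - lost                         # RAM that lives above 4 GB
    reachable_high = min(high, max(0, max_phys_addr - 4 * GB))
    return low + reachable_high
```

For example, `usable_ram(4 * GB, 900 * MB, can_remap=False)` comes out to 4 GB minus 900 MB, which is the roughly 3.1 GB figure. The same machine with remapping and all 36 address lines wired (`max_phys_addr=64 * GB`) gets the full 4 GB.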
A software limitation is that not all drivers behave well in the presence of 64 bit physical addresses. Many assume that only the bottom 32 bits of the physical address are valid. Others don't properly handle the creation of bounce buffers when necessary (needed when transferring data from a hardware device to/from a buffer that is above the 4 GB mark in physical memory).
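A quick Python illustration of both driver mistakes. It's purely a model of the arithmetic, not real driver code, and the function names are hypothetical.

```python
def truncate_to_32_bits(phys_addr: int) -> int:
    """What a buggy driver effectively does when it keeps only the low 32 bits."""
    return phys_addr & 0xFFFF_FFFF


def needs_bounce_buffer(phys_addr: int, length: int, device_addr_bits: int = 32) -> bool:
    """True if any byte of the buffer lies beyond what the device can address."""
    return phys_addr + length > (1 << device_addr_bits)
```

A buffer at physical address 0x1_0000_1000 (just above 4 GB) truncates to 0x1000, so the device would happily DMA into a completely unrelated page. A correct driver for a 32-bit device notices that `needs_bounce_buffer` is true and stages the data through a buffer below 4 GB instead.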
Once PAE mode became popular, Microsoft started getting a huge wave of crashes and blue screens that were traced to drivers failing to correctly handle 64-bit physical addresses. A decision was made to make the system more stable at a cost of possibly wasting memory. XP SP2 introduced a change such that only physical addresses below the 4 GB boundary (those that fit in 32 bits) will ever be used, even if that means wasting memory. (This is also the case with Vista.)
The server Operating Systems still allow the use of larger amounts of memory, with the assumption that higher quality parts will be used and drivers will be more likely to have been tested in PAE mode with large addresses.
The default Vista configuration works great and is quite reasonable for the average non-government, non-corporate user. It makes quite reasonable tradeoffs between usability and security. XP and earlier versions of Windows definitely had some things enabled that shouldn't have been. Vista is much better about that.
The default Vista configuration does not work so great in a corporate environment. One size does NOT fit all. Because one size does not fit all, Microsoft decided to make the default work well for the user who installs Vista in isolation (home, home office, or non-domain business user). If you install into a domain, the defaults might not work so well, but you're likely to have domain group policies to fix the defaults automatically.
Microsoft has distributed some guidelines for how Vista should be set up in various situations, along with group policy templates and some tools to help administrators automatically reconfigure all machines on the domain to comply with the policy. So far, administrators seem to be happy with this arrangement.
That's the whole problem: OSS zealots know they need more supporters, but they see all non-OSS zealots as evil, even those who are actually quite decent folk. Whether it's beneficail (sic) or not is irrelevant; it's the only way they are able to think.
Sorry, I'm just kind of thinking that lumping everybody at Microsoft into a single evil entity is pretty lame. Probably about as lame as assuming that everybody who uses Linux thinks the same way.
Windows maintains its internal clock as UTC. Things just get too messy otherwise.
Windows does not currently have working support for a CMOS (hardware) clock that is not set to local time. It converts the internal UTC time to local time before updating the hardware clock, and when reading the hardware clock, Windows assumes it is set to local time. (This is rather silly if you ask me. Unfortunately, nobody ever asks me.)
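The two conversions described above look roughly like this in Python. It's a sketch that assumes a fixed UTC offset (no daylight saving handling), and the function names are mine, not anything from Windows.

```python
from datetime import datetime, timedelta, timezone


def rtc_to_utc(rtc_local: datetime, utc_offset_hours: float) -> datetime:
    """Treat a naive hardware-clock reading as local time and return UTC."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return rtc_local.replace(tzinfo=local_tz).astimezone(timezone.utc)


def utc_to_rtc(utc_time: datetime, utc_offset_hours: float) -> datetime:
    """Convert the internal (timezone-aware) UTC time to the naive local time written to the RTC."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return utc_time.astimezone(local_tz).replace(tzinfo=None)
```

The trouble is visible right in the signatures: the hardware clock value is meaningless without knowing which offset was in effect when it was written, which is exactly what goes wrong around daylight saving changes or when dual-booting an OS that keeps the clock in UTC.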
If you mean "768k", then you probably want DOS or Windows 3.0. Windows 3.1 needs more memory to run well.
If you mean "768 MB", then Windows XP or Windows 2000 will work fine.
If you mean "128 MB", then Windows 2000 will work fine. (Windows 2000 actually runs ok with 64 MB until you install a virus checker and firewall.) Windows XP will be slow without 256 MB.
I would not recommend Windows 95/98/ME for any general purpose machine. If your computer can't take Windows 2000, either upgrade or switch to Linux or FreeBSD.
If you create a new process often in a "high performance" system, you're going to be in a world of hurt whether on Unix or Win32. Yeah, it is more expensive on Win32, but processes ain't cheap on Unix either. Even thread creation is "expensive" for high-performance scenarios, which is why most well-written systems use pooling for processes and threads alike.
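For what it's worth, pooling is about as simple as it gets in most environments. Here's a minimal Python sketch using the standard library's `ThreadPoolExecutor`; `handle_request` and `serve` are placeholder names standing in for real work.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_request(n: int) -> int:
    """Stand-in for real per-request work."""
    return n * n


def serve(requests, workers: int = 4):
    """Run every request on a fixed pool of reusable worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands requests to idle workers and returns results in input order.
        return list(pool.map(handle_request, requests))
```

The threads are created once and reused, so the per-request cost is a queue operation rather than a thread or process creation.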
PostgreSQL creates a process per connection, with the idea that connections stick around for a while. There's no reason this can't be efficient on Win32. Without the Unix-compatibility layers, PostgreSQL would probably perform as well on Win32 as it does on Unix.
Apache 1.x creates a worker process that handles some number of requests before terminating. Again, this works fine on Win32. However, because it could be even better with threads, Apache 2.0 introduced the threaded model.
Microsoft does not supply a PThreads implementation for Windows. But as you've suggested, many libraries exist to map PThreads to Win32 API calls. The mapping is relatively trivial and costs almost nothing in terms of performance. Porting a threaded Linux app to Windows is actually far easier than porting a multi-process Linux app like PostgreSQL. It's easy to map the threading APIs from Linux to Win32, but things like fork() and System V semaphores are harder to map.
Examples come and examples go. None of the stuff in that VB example is undocumented, just confusing if you're not familiar with the Win32 API. If you had asked me to do something similar in VB, I would have produced something similar without needing the example. Coming from a C++ background, that would just be the natural thing for me to do.
Though I do admit that MSDN can be confusing or even wrong at times. That's what happens when you've got that much content. Errors show up and don't get identified or fixed for years. But if you click on the "Send feedback" link, usually you get a response and the error usually gets fixed.
Why do you want to create a whole new process every time you want a new unit of scheduling? The thread is the unit at which scheduling occurs. The process is the division between address spaces. Each of these entities can be used independently. Why tie them together?
The process scheduler is not at fault here. (There's actually no such thing -- processes are not scheduled. Threads are scheduled. Just call it a scheduler.) The scheduler works just fine in any case. Switching between threads in the same process is slightly faster than switching between threads in different processes (on any OS, not just Windows) due to extra context updates and TLB flushes, but that probably isn't the biggest problem. The main issue is that it is wasteful (and sometimes a major pain) to force the developer to create a new process when all that is needed is a new thread.
If you need both a new address space and a new thread, go ahead and create a new process. But if all you need is a thread, just create a thread.
It would probably be possible to make a system that uses process-based concurrency and works well on Win32. Win32 has plenty of inter-process communication mechanisms that are very efficient, and the scheduler handles this situation with no problems. However, typically when a program gets ported to run well on Win32, it also gets ported to make use of threads instead of processes for concurrency, probably for the significant memory and process startup savings, and possibly also for the context switch savings. The result is that I can't think of any high-performance system that uses processes as the mechanism for concurrency on Windows. I don't think it is impossible, but the other mechanisms available seem to be more attractive.
(Not to mention that when you have multiple threads working within the same process, certain new concurrency mechanisms become possible - asynchronous IO, completion ports, etc. are not as useful or efficient when only one thread is allowed per process.)
Wow, I must have gotten the extra special version, cuz my Vista certainly does include xcopy.
---
C:\>xcopy /?
Copies files and directory trees.
NOTE: Xcopy is now deprecated, please use Robocopy.
---
Though I can't believe Robocopy can be considered a replacement for xcopy.
Actually, depending on how you look at it, you could consider the Vista audio driver architecture to be much simpler. Some pieces have been separated out, but they are simpler pieces.
Creative is all up in arms because Vista doesn't have as many hooks for "audio acceleration" as XP did. But you know where that got us -- I can't count the number of times my system crashed in the audio driver, and Creative's drivers were among the worst. Microsoft got fed up with the number of crashes in the audio drivers (all blamed on MS, of course) and decided to make a simple, solid architecture. Maybe in the future they'll put some hooks back in, but for now, the driver doesn't have much to do, and that means that in many cases, all that fancy hardware doesn't have anything to do. Fine by me - all I want is to get the audio data out to my speakers reliably without crashing my system.
Anyway, the basic idea is this:
1. User mode app talks to the audio API (a DLL loaded into the App's process). The API talks to the audio service (which runs in User mode) and requests a "connection". The audio service allocates some memory and shares it with the app.
2. From this point on, most of the communication between the app and the audio service is done via the shared buffers. This is nice because there is no context switching or kernel-mode call needed to fill the buffer. The app loads the shared memory with say 10 ms worth of audio data, and the service mixes that data with the data from any other apps and puts the mixed data into a buffer. All effects and mixing is done in user mode.
3. The audio driver knows where the audio service is putting the mixed audio data. It picks up the data directly from that buffer. Again, no context switching is needed and no data needs to be copied from process to process or user mode to kernel mode. No mixing or audio processing is done in kernel mode. All the driver does is initialize the audio card and get the data from the buffer to the audio card.
The result of all of this is an audio stack where all of the complex code is provided by Microsoft, is the same code path for all audio cards, has almost no overhead (buffer copying or context switching), etc. The only issue is that the mixing and effects all need to be done in software (not a big deal - mixing and typical audio effects take much less than 1% of a modern processor's time) and that the audio service and kernel mode driver need to be reliably scheduled to run once every 10 ms or so (not a big deal if your hardware and drivers follow the ACPI rules, though I've seen some issues where a laptop motherboard is too aggressive about trying to go into deep sleep and is unable to wake up quickly enough and it misses some deadlines causing audio pops).
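As a rough illustration of how little work the user-mode mixing step involves, here is a minimal sketch. It assumes 16-bit integer samples and a 48 kHz sample rate purely for simplicity (neither detail comes from the description above), and the names mix_frames, app_a, and app_b are made up for the example. It is not the actual Vista audio engine.

```python
from array import array

FRAME_MS = 10            # period mentioned in the post
SAMPLE_RATE = 48_000     # assumed sample rate, not taken from the post
SAMPLES_PER_FRAME = SAMPLE_RATE * FRAME_MS // 1000

INT16_MIN, INT16_MAX = -32768, 32767


def mix_frames(app_buffers):
    """Sum one frame from every app buffer, clipping to the 16-bit range."""
    mixed = array("h", [0] * SAMPLES_PER_FRAME)
    for i in range(SAMPLES_PER_FRAME):
        total = sum(buf[i] for buf in app_buffers)
        mixed[i] = max(INT16_MIN, min(INT16_MAX, total))
    return mixed


# Two hypothetical apps, each having filled its shared buffer with one frame.
app_a = array("h", [1000] * SAMPLES_PER_FRAME)
app_b = array("h", [32000] * SAMPLES_PER_FRAME)

# The mixed frame is what the driver would pick up and hand to the audio card.
device_buffer = mix_frames([app_a, app_b])
```

The whole job is one pass over a few hundred samples per period, which is why doing it in software costs so little CPU time.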
Yeah, I wasn't distinguishing between "DX10 mode" and "DX10 hardware" in my post. Sure, anything with new drivers and any acceleration will run in Aero, but very little DX10 hardware support is out there yet.
My laptop uses an ATI integrated graphics chip, but your analysis is otherwise accurate.
Both kinds of full screen cause trouble. The best playback method is for me to use Media Player Classic and choose ... oh, I forget exactly which output method it is ... one of them is incompatible with desktop composition, so it is kind of an easy way to get Vista to automatically downgrade me when I start playing something.
DX 10 is not designed to force anybody to do anything. It was a big change in the way DirectX works so it required significant changes in the kernel's video system and significant changes in the structure of video drivers. That kind of thing is really hard to stuff into a service pack.
I think that in the long term, the change (moving to the Vista video architecture) will be a good thing. The Vista video model seems to address a lot of real issues like sharing the 3D features of the video card (previously not a real possibility). In the short term, the change is a bit painful and offers no real benefit (just nifty eye candy and effects). If I were a game developer, I certainly wouldn't develop any games that only run on DX10.
I don't think that is entirely unexpected -- most developers still support DX8. However, just like most developers can expect most of their gamers to have DX9 hardware and software, eventually developers will be able to expect gamers to have DX10 hardware and software. Then there will be benefits.
In the meantime, I can understand some frustration. For example, due to my laptop's lousy video driver, I can't play full-screen video in DX10 (Aero transparency enabled) mode. However, if I switch to the "Basic" mode, suddenly all is well. So this is certainly painful.
It's actually intentional throttling that wasn't fully thought-out. See Mark Russinovich's blog: http://blogs.technet.com/markrussinovich/archive/2007/08/27/1833290.aspx
Certain parts of network handling occur in the kernel with scheduling disabled, which means that they have the potential to cause glitches in the audio if too many packets are handled per second. Vista tries to provide low-latency audio, so it has less tolerance for delays than previous versions of Windows. So the audio guys talked to the networking guys and decided there would be a flag that says "please reserve some time for audio processing if an audio stream is being processed". Unfortunately, the reservation was way too aggressive.
No excuses - this is a bad bug and should never have happened. The behavior is designed to fix a problem, but the fix created a new one. It will be fixed. Hopefully that fix won't cause more trouble...
Obviously, nobody likes a crash. The best behavior is to tell the user that the document is not valid, or perhaps try to correct for the error and go on. So I'm fairly confident that the developers involved will examine the crash reports, fix the issue in the Word codebase, and continue on. Whether that fix shows up in a patch depends on the impact, risk, cost, etc.
As has often been seen, every patch has a certain potential to disrupt the rest of the system or uncover a new (possibly worse) issue. It is very possible that by fixing the cause of the crash, Microsoft would allow the system to continue into behavior worse than crashing. Pushing out a patch is not the right thing to do if the risk of fixing the issue is high and the value of fixing the issue is low.
Severity is hard to define. If the bug is only going to happen if the user opens a bad document, and the bug immediately exits the system and prevents the user from viewing the bad document, I'm not too worried. If the bug causes the system to slow down and the user has to end the process, I'm still not too worried. If the bug causes the system to stop responding, then I'm a bit worried and it needs to be fixed as soon as the patch can be carefully tested to ensure it doesn't cause other problems; it probably will show up in an optional patch, KB article, and/or service pack. If the bug exposes a security vulnerability, I'm really worried and you'll probably see a patch for this issue next Patch Tuesday.
There are a lot of conditions where a programming error will be caught by the system, resulting in a "safe crash". Certain system APIs have built-in assertions that will immediately terminate the application if faulty behavior is noticed. While it would be great for the software to be bug-free, we all know that bug-free software is pretty hard to come by at any cost. Microsoft has made a big investment in setting things up so that when a potentially dangerous bug is encountered at runtime, the program exits immediately and optionally sends a crash dump to Microsoft. This process is sometimes called "InstaDeath". For example, the compiler's stack buffer overrun protection will trigger InstaDeath if it detects a stack or heap buffer overflow. An attempt to call an unregistered exception handler will trigger InstaDeath. The secure CRT functions (highly recommended!) will optionally trigger InstaDeath (mostly because if you don't expect and properly handle string truncation, that can also be a security issue).
While this might result in more crashes, it also results in the following good behavior:
- The crash occurs at the precise point of the unexpected condition instead of some random time later (or not at all). Instead of allowing the attacker's code to run or corrupting data, the app exits immediately. The crash dump points directly at the place where the unexpected situation was encountered, making a fix much more likely.
- Entire classes of bugs will now always result in an immediate crash every time. While crashing is never ideal, it's better than security issues or data corruption. Bugs have to be found and fixed one by one. If a class of bugs now always causes an immediate crash, it can never be used for an exploit (though it can be a DoS).
Again, crashing is never good. But in some cases, it is the right response. And if the crash occurs via the appropriate mechanism, it is a controlled crash that cannot be exploited (except for DoS).
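The "crash at the precise point of the unexpected condition" idea can be sketched in a few lines. This is only an illustration, not how any Microsoft API is implemented: checked_copy and FatalInvariantError are hypothetical names, and raising an exception stands in for terminating the process.

```python
class FatalInvariantError(Exception):
    """Stand-in for immediate process termination (hypothetical name)."""


def checked_copy(dest: bytearray, src: bytes) -> None:
    """Copy src into the start of dest, refusing to overflow or truncate."""
    if len(src) > len(dest):
        # Fail at the exact point where the bad condition is detected,
        # before a single byte of dest has been modified.
        raise FatalInvariantError(
            f"copy of {len(src)} bytes into {len(dest)}-byte buffer"
        )
    dest[: len(src)] = src


buf = bytearray(8)
checked_copy(buf, b"ok")      # fits, so the copy proceeds normally
```

An oversized input never gets the chance to corrupt anything: the failure report points at the copy itself instead of at whatever unrelated code trips over the damage later.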
I'm sure the bugs will be investigated, and if appropriate, patches will be issued. However, the security researchers at Microsoft are not wrong in saying that crashing is better than continuing on in a bad state. And they're not wrong in saying that crashing is not always a security flaw.
One last real example. I recently read about an exploitable browser flaw that was only exploitable because the browser did not crash. The exploit data caused an access violation nearly every time. Once in a great while, the exploit data successfully attacked the system. Under no
It's also important to note that both 32-bit and 64-bit Vista maintain a concept of kernel "taint". If you load an unsigned driver, the kernel is marked as tainted. Apps have the option of refusing to run if the kernel is tainted. It is expected that in the future, WinDVD and similar products will do this. (You CAN load an unsigned driver in 64-bit Vista, but you have to turn off the on-by-default restriction each time you boot.)
Of course, if you load a driver that is unsigned, it can try to mark the kernel as untainted again. Then Microsoft will issue a patch to prevent this, then the driver will be updated to work with the patch, then Vista will be updated again, then the driver will be updated again...
Facts: Lua looks kinda like Pascal. The source is compiled to bytecode (at runtime or ahead of time) for efficiency. The license is not at all restrictive. The engine is about 90k compiled, 112k if you add all standard libraries. It runs very quickly for an interpreted language. The language is very simple (grammar fits on a page). The engine handles huge programs with no trouble. Lua implements closures (lexical scoping). Unreferenced data is garbage collected. A kind of exception handling is available (essentially try-catch with no filter). Most functionality is provided via C libraries.
Nice things: The C-to-Lua and Lua-to-C interface is probably best-in-breed (extremely easy to work with, way easier than embedding Python or Perl).
Interesting things: Native types include number (only one kind, "double" by default), immutable string (binary-safe), boolean, nil, table (hashtable), function (either a Lua function or a C function), and userdata (opaque blobs of data that C code can use). The only data structure is the table. Tables are used to implement structures (map a string member name to a value), arrays (map a number to a value), classes (map a string to a value; functions are values too). Even the set of currently-accessible variables (the environment) is a table.
Tricky things: While the table is very flexible, it sometimes takes some stretching to make it cover all the cases. You need to learn a number of idioms to be really effective with Lua.
I've written most of a library in Lua for easy-to-write self-contained apps. It was incredibly easy to use. (Maybe someday I'll actually finish my library...)
/analyze is pretty good. If you're using one of the more expensive editions of Visual Studio, support for /analyze is built into the IDE and very convenient.
With the latest versions of the Windows SDK, /analyze becomes much more powerful. /analyze has built-in models for the behavior of some CRT-defined functions, but all other functions are black boxes. The newest CRT and Windows SDK headers (as well as any .h files generated by a recent version of MIDL) have all been annotated with "SAL" annotations that tell PREfast or /analyze how to model their behavior. For example, here is strncpy:
__RETURN_POLICY_DST char*
strncpy(
    __out_ecount(_Count) char * _Dest,
    __in_z const char * _Source,
    __in size_t _Count
);
SAL annotations are tags like "__in" that are #define'd to nothing for normal compilation, but are understood by /analyze and PREfast as indicating constraints on the parameters passed to a function and the function's return values. If one path through your code calls strncpy(dest, src, 45) when dest is NULL or dest is a char[44], /analyze will flag an error.
PREfast has an extensive plugin system that is missing from /analyze. In addition, /analyze is not configurable (it's either on or off, nothing in between). Finally, because it is not configurable, some classes of warnings that are often false positives have been disabled for /analyze. But /analyze is still incredibly useful.
Nit-pick: There are some nice administrative advantages to VM, so this isn't always true. If the administrative advantages outweigh the overhead, one VM on a host could still be worthwhile.
Um, did you read my earlier post? There's a region of physical address space reserved for hardware. It varies from 500 to 900 MB, depending on your system's hardware. 4 GB - 900 MB = 3.1 GB.
A 32-bit virtual address space means that pointers are 32 bits, so there is a natural limit of 4 GB of addresses simultaneously available to a process. However, the range of physical memory mapped into the process can change. And different processes can have different ranges of memory mapped into them. Thus even though each process is limited to 4 GB at once, a process can make use of more than that by swapping data in and out of its address space (assuming appropriate OS support). Another way is to have 3 or more processes, each individually limited to 4 GB, with each assigned its own 2 GB of physical memory, meaning that the system actually makes use of 6 GB or more.
The OS is also limited in address space, but the OS can map memory in and out of its own address space as needed. In the same way that the OS can swap memory out to disk, the OS can throw pages out of its address space when they aren't needed, then pull them back in. Thus, the OS can access more than 4 GB of RAM.
Kernel mode also uses a 32-bit virtual address space, but often drivers have to work with physical addresses. For example, a driver needs to transfer data to the hardware. So it takes the address of the memory buffer with the data, asks Windows for the corresponding physical address, and sends the physical address and the buffer length to the device. The device then does a DMA transfer to or from the given physical address and signals to the driver when it is done. If the driver doesn't properly handle 64-bit physical addresses, this process will not work correctly. By never using any physical addresses where any of the upper 32 bits are set, Windows is able to avoid one possible problem with 32-bit device drivers. (There are still many other possible problems.)
http://support.microsoft.com/kb/929605/en-us
Recent x86 CPUs (Pentium Pro and later) have 36 address pins and can address 64 GB of RAM. This is done by using PAE mode. PAE mode changes the layout of the page tables. Page tables map 32-bit virtual addresses to physical addresses. Without PAE, the 32-bit virtual addresses map through 2 levels of page tables (1 level for huge pages) and are translated to 32-bit physical addresses. With PAE, the 32-bit virtual addresses map through 3 levels of page tables (2 levels for huge pages) and are translated to 64-bit physical addresses.
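To make the 2-level versus 3-level difference concrete, here is a small sketch of how a 32-bit virtual address is carved into table indexes for ordinary 4 KB pages. The bit widths are the standard x86 layout (10/10/12 without PAE, 2/9/9/12 with PAE); the function names and the sample address are invented for the example.

```python
def split_non_pae(vaddr):
    """Return (directory index, table index, offset): 10 + 10 + 12 bits."""
    return ((vaddr >> 22) & 0x3FF, (vaddr >> 12) & 0x3FF, vaddr & 0xFFF)


def split_pae(vaddr):
    """Return (PDPT index, directory index, table index, offset): 2 + 9 + 9 + 12 bits."""
    return (
        (vaddr >> 30) & 0x3,
        (vaddr >> 21) & 0x1FF,
        (vaddr >> 12) & 0x1FF,
        vaddr & 0xFFF,
    )


vaddr = 0xC0801234                 # arbitrary example address
non_pae_indexes = split_non_pae(vaddr)
pae_indexes = split_pae(vaddr)
```

The virtual address stays 32 bits wide in both modes. PAE tables hold half as many entries (512 instead of 1024) because each entry doubles in size to make room for the wider physical address, which is why an extra level is needed.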
Side-note: PAE is also related to page execution protection, called "hardware DEP" (Microsoft term), "NX" (AMD term), and "XD" (Intel term). In 32-bit x86 processors, this can only be used in PAE mode. This is why you might see PAE mode used even on systems with less than 4 GB of memory.
One thing that can prevent access to more than 4 GB of RAM is motherboard design. PAE can only access 64 GB of memory if all 36 address pins are properly wired up on the motherboard. This is not always the case, since those extra 4 wires actually do make the motherboard just a little bit more expensive to design and manufacture. Many motherboards only have 32 address pins connected. If that is the case, no OS will be able to access more than 4 GB of address space. Since your hardware uses up the top 500-900 MB or so of address space, your system will be limited to somewhere around 3.1 GB of RAM.
Another hardware limitation is the ability of the chipset to remap RAM. If you have 4 GB of RAM, and 600 MB of address space is used up by PCI/AGP reserved areas, the only way to access the top 600 MB of RAM is to remap it into the addresses above the 4 GB boundary. Not all chipsets are able to do this, so some will just waste any RAM that happens to be shadowed by a PCI/AGP reserved region.
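The arithmetic behind these two hardware limits can be sketched as follows. This is a simplified model under stated assumptions: the reserved region sits directly below the 4 GB boundary, remapped RAM lands directly above it, and usable_ram and its parameters are hypothetical names chosen for the example.

```python
MB = 1024 * 1024
GB = 1024 * MB


def usable_ram(installed, reserved, max_phys, can_remap):
    """Bytes of RAM the OS can reach under this simplified model.

    installed: RAM fitted in the machine
    reserved:  address space just below 4 GB claimed by hardware
    max_phys:  size of the physical address range that is actually usable
    can_remap: whether the chipset can relocate shadowed RAM above 4 GB
    """
    hole_start = 4 * GB - reserved
    below_hole = min(installed, hole_start)
    shadowed = max(0, min(installed, 4 * GB) - hole_start)
    above_4gb = max(0, installed - 4 * GB)
    # RAM that ends up at addresses of 4 GB and higher.
    high = above_4gb + (shadowed if can_remap else 0)
    high_usable = min(high, max(0, max_phys - 4 * GB))
    return below_hole + high_usable


# 4 GB installed, 900 MB reserved, only 32 address lines usable.
limited = usable_ram(4 * GB, 900 * MB, 4 * GB, can_remap=True)
# Same machine with all 36 address lines wired and a chipset that remaps.
full = usable_ram(4 * GB, 900 * MB, 64 * GB, can_remap=True)
```

Under these assumptions the first case comes out at 3196 MB, which is the "around 3.1 GB" figure, and the second recovers the full 4096 MB.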
A software limitation is that not all drivers behave well in the presence of 64 bit physical addresses. Many assume that only the bottom 32 bits of the physical address are valid. Others don't properly handle the creation of bounce buffers when necessary (needed when transferring data from a hardware device to/from a buffer that is above the 4 GB mark in physical memory).
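Both driver mistakes are easy to show with plain integer arithmetic. The sketch below is illustrative only: the buffer address is made up, and truncated_address and needs_bounce_buffer are hypothetical helpers, not Windows driver APIs.

```python
def truncated_address(phys_addr):
    """What a buggy driver hands to the device if it keeps only the low 32 bits."""
    return phys_addr & 0xFFFFFFFF


def needs_bounce_buffer(phys_addr, length, device_addr_bits=32):
    """True if any byte of the buffer lies beyond what the device can address."""
    return phys_addr + length > (1 << device_addr_bits)


buffer_addr = 0x1_2345_6000        # hypothetical buffer just above 4 GB
wrong_addr = truncated_address(buffer_addr)

# A 32-bit device cannot reach this buffer directly, so the data would have
# to be staged through a buffer located below 4 GB.
must_bounce = needs_bounce_buffer(buffer_addr, 4096)
```

The truncated address is a perfectly valid location below 4 GB, just not the right one, so the DMA transfer quietly reads or overwrites someone else's memory. That is why these bugs tend to show up as seemingly random crashes.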
Once PAE mode became popular, Microsoft started getting a huge wave of crashes and blue screens that were traced to drivers failing to correctly handle 64-bit physical addresses. A decision was made to make the system more stable at a cost of possibly wasting memory. XP SP2 introduced a change such that only the bottom 32 bits of physical memory will ever be used, even if that means wasting memory. (This is also the case with Vista.)
The server Operating Systems still allow the use of larger amounts of memory, with the assumption that higher quality parts will be used and drivers will be more likely to have been tested in PAE mode with large addresses.
The default Vista configuration works great and is quite reasonable for the average non-government, non-corporate user. It makes quite reasonable tradeoffs between usability and security. XP and earlier versions of Windows definitely had some things enabled that shouldn't have been. Vista is much better about that.
The default Vista configuration does not work so great in a corporate environment. One size does NOT fit all. Because one size does not fit all, Microsoft decided to make the default work well for the user who installs Vista in isolation (home, home office, or non-domain business user). If you install into a domain, the defaults might not work so well, but you're likely to have domain group policies to fix the defaults automatically.
Microsoft has distributed some guidelines for how Vista should be set up in various situations, along with group policy templates and some tools to help administrators automatically reconfigure all machines on the domain to comply with the policy. So far, administrators seem to be happy with this arrangement.
Here is some relevant Microsoft-sponsored Kool-Aid (beware - if you watch it, you might be brainwashed!): http://channel9.msdn.com/Showpost.aspx?postid=283
That's the whole problem: OSS zealots know they need more supporters, but they see all non-OSS zealots as evil, even those who are actually quite decent folk. Whether it's beneficail (sic) or not is irrelevant; it's the only way they are able to think.
Sorry, I'm just kind of thinking that lumping everybody at Microsoft into a single evil entity is pretty lame. Probably about as lame as assuming that everybody who uses Linux thinks the same way.
Just to avoid confusion on the terms -
Windows maintains its internal clock as UTC. Things just get too messy otherwise.
Windows does not currently have working support for a CMOS (hardware) clock that is not set to local time. It converts the internal UTC time to local time before updating the hardware clock, and when reading the hardware clock, Windows assumes it is set to local time. (This is rather silly if you ask me. Unfortunately, nobody ever asks me.)
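The two conversions described above look roughly like this. It is a sketch with a fixed UTC offset and no daylight-saving handling; the function names and the UTC-8 example zone are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def rtc_local_to_utc(rtc_reading, utc_offset):
    """Interpret a naive hardware-clock reading as local time and return UTC."""
    local = rtc_reading.replace(tzinfo=timezone(utc_offset))
    return local.astimezone(timezone.utc)


def utc_to_rtc_local(utc_time, utc_offset):
    """Convert the internal UTC time to the naive local value written to the clock."""
    return utc_time.astimezone(timezone(utc_offset)).replace(tzinfo=None)


offset = timedelta(hours=-8)             # hypothetical zone, UTC-8
rtc = datetime(2007, 11, 1, 9, 30)       # what the hardware clock shows
utc = rtc_local_to_utc(rtc, offset)      # internal time: 17:30 UTC
```

If another OS on the same machine keeps the hardware clock in UTC, the first function applies the offset to a value that was already UTC, and the reported time is wrong by exactly that offset.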
I don't understand "768k".
If you mean "768k", then you probably want DOS or Windows 3.0. Windows 3.1 needs more memory to run well.
If you mean "768 MB", then Windows XP or Windows 2000 will work fine.
If you mean "128 MB", then Windows 2000 will work fine. (Windows 2000 actually runs ok with 64 MB until you install a virus checker and firewall.) Windows XP will be slow without 256 MB.
I would not recommend Windows 95/98/ME for any general purpose machine. If your computer can't take Windows 2000, either upgrade or switch to Linux or FreeBSD.
If you create a new process often in a "high performance" system, you're going to be in a world of hurt whether on Unix or Win32. Yeah, it is more expensive on Win32, but processes ain't cheap on Unix either. Even thread creation is "expensive" for high-performance scenarios, which is why most well-written systems use pooling for processes and threads alike.
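For anyone who hasn't used pooling, the idea is to create a handful of workers once and reuse them for every request. A minimal sketch using Python's standard thread pool follows; handle_request is a hypothetical unit of work, and the pool size of 4 is arbitrary.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle_request(n):
    """Hypothetical unit of work: returns a result and the worker thread's name."""
    return n * n, threading.current_thread().name


# 100 requests are served by at most 4 threads, which are created once and reused.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(handle_request, range(100)))

squares = [value for value, _ in results]
workers_used = {name for _, name in results}
```

The creation cost is paid at most four times instead of a hundred, no matter how many requests arrive. The same pattern applies to process pools.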
PostgreSQL creates a process per connection, with the idea that connections stick around for a while. There's no reason this can't be efficient on Win32. Without the Unix-compatibility layers, PostgreSQL would probably perform as well on Win32 as it does on Unix.
Apache 1.x creates a worker process that handles some number of requests before terminating. Again, this works fine on Win32. However, because it could be even better with threads, Apache 2.0 introduced the threaded model.
Microsoft does not supply a PThreads implementation for Windows. But as you've suggested, many libraries exist to map PThreads to Win32 API calls. The mapping is relatively trivial and costs almost nothing in terms of performance. Porting a threaded Linux app to Windows is actually far easier than porting a multi-process Linux app like PostgreSQL. It's easy to map the threading APIs from Linux to Win32, but things like fork() and System V semaphores are harder to map.
Examples come and examples go. None of the stuff in that VB example is undocumented, just confusing if you're not familiar with the Win32 API. If you had asked me to do something similar in VB, I would have produced something similar without needing the example. Coming from a C++ background, that would just be the natural thing for me to do.
Though I do admit that MSDN can be confusing or even wrong at times. That's what happens when you've got that much content. Errors show up and don't get identified or fixed for years. But if you click on the "Send feedback" link, usually you get a response and the error usually gets fixed.
Why do you want to create a whole new process every time you want a new unit of scheduling? The thread is the unit at which scheduling occurs. The process is the division between address spaces. Each of these entities can be used independently. Why tie them together?
The process scheduler is not at fault here. (There's actually no such thing -- processes are not scheduled. Threads are scheduled. Just call it a scheduler.) The scheduler works just fine in any case. Switching between threads in the same process is slightly faster than switching between threads in different processes (on any OS, not just Windows) due to extra context updates and TLB flushes, but that probably isn't the biggest problem. The main issue is that it is wasteful (and sometimes a major pain) to force the developer to create a new process when all that is needed is a new thread.
If you need both a new address space and a new thread, go ahead and create a new process. But if all you need is a thread, just create a thread.
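A short sketch of what "all you need is a thread" buys you: every thread in a process sees the same memory, so shared state needs nothing more than a lock. The names here are illustrative, and Python is used only as convenient notation for the general idea.

```python
import threading

counter = {"value": 0}       # lives in the one address space all threads share
lock = threading.Lock()


def worker(iterations):
    """Increment the shared counter; the lock keeps each update atomic."""
    for _ in range(iterations):
        with lock:
            counter["value"] += 1


threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Doing the same thing with four processes would require an explicit shared-memory region or some other IPC mechanism, because each process gets its own address space.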
It would probably be possible to make a system that uses process-based concurrency and works well on Win32. Win32 has plenty of inter-process communication mechanisms that are very efficient, and the scheduler handles this situation with no problems. However, typically when a program gets ported to run well on Win32, it also gets ported to make use of threads instead of processes for concurrency, probably for the significant memory and process startup savings, and possibly also for the context switch savings. The result is that I can't think of any high-performance system that uses processes as the mechanism for concurrency on Windows. I don't think it is impossible, but the other mechanisms available seem to be more attractive.
(Not to mention that when you have multiple threads working within the same process, certain new concurrency mechanisms become possible - asynchronous IO, completion ports, etc. are not as useful or efficient when only one thread is allowed per process.)