Multicore Requires OS Rework, Windows Expert Says
alphadogg writes "With chip makers continuing to increase the number of cores they include on each new generation of their processors, perhaps it's time to rethink the basic architecture of today's operating systems, suggested Dave Probert, a kernel architect within the Windows core operating systems division at Microsoft. The current approach to harnessing the power of multicore processors is complicated and not entirely successful, he argued. The key may not be in throwing more energy into refining techniques such as parallel programming, but rather rethinking the basic abstractions that make up the operating systems model. Today's computers don't get enough performance out of their multicore chips, Probert said. 'Why should you ever, with all this parallel hardware, ever be waiting for your computer?' he asked. Probert made his presentation at the University of Illinois at Urbana-Champaign's Universal Parallel Computing Research Center."
Oh please, this has been coming for years now. Why has it taken so long for the OS designers to get with the program? We've had multi-CPU servers for literally decades.
'Why should you ever, with all this parallel hardware, ever be waiting for your computer?'
Because I/O is always going to be slow.
Sent from my PDP-11
The problem is that most (if not all) peripheral hardware is not parallel in many senses. Hardware in today's computers is serial: you access one device, then another, then another. There are some cases (such as a few good emulators) which use multi-threaded emulation (sound in one thread, graphics in another), but fundamentally the biggest performance killer is the final IRQs that get called to process data. The structure of modern-day computers must change to take advantage of multicore systems.
...the implementation sucks.
Why for example does Windows Explorer decide to freeze ALL network connections when a single URN isn't quickly resolved? Why is it that when my USB drive wakes up, all explorer windows freeze? If you are trying to tell me there's no way using the current abstractions to implement this I say you're mad. For that matter when a copy or move fails in Explorer, why can't I simply resume it once I've fixed whatever the problem is. You're left piecing together what has and hasn't been moved. File requests make up a good deal of what we're waiting for. It's not the bus or the drives that are usually the limitation. It's the shitty coding. I can live with a hit at startup. I can live with delays if I have to eat into swap. But I'm sick and tired of basic functionality being missing or broken.
These posts express my own personal views, not those of my employer
Fist post!
I come to /. to read tech news... not to see people fisting.
Interesting.
Microsoft should go back and read some of the literature on parallel computing from 20-30 years ago. Machines with many cores are nothing new. And Microsoft could have designed for it if they hadn't been busy re-implementing a bloated version of VMS.
Windows explorer sucks. It always just abandons copies after a fail - even if you're moving thousands of files over a network. Yes, you're left wondering which files did/didn't make it. It's actually easier to sometimes copy all the files you want to shift locally, then move the copy, so that you can resume after a fail. It's laughable you have to do this, however.
But it's not a concurrency issue, and neither, really, are the first 2 problems you mention. They're also down to Windows Explorer sucking.
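For what it's worth, the resume-after-a-fail behaviour asked for above is not hard to sketch. This is a minimal Python sketch, not how Explorer or any real tool does it. The function name `resumable_copy` is made up for illustration, and it assumes that a destination file with the same size as the source arrived completely, which is a weak check (a checksum would be stronger).

```python
import shutil
from pathlib import Path

def resumable_copy(src, dst):
    """Copy the tree under src into dst, skipping files that already arrived.

    Hypothetical helper for illustration. Returns the relative paths
    (POSIX style) that were copied on this run.
    """
    src, dst = Path(src), Path(dst)
    copied = []
    for path in sorted(src.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(src)
        target = dst / rel
        # Assumption: same size means the earlier copy of this file finished.
        if target.exists() and target.stat().st_size == path.stat().st_size:
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)
        copied.append(rel.as_posix())
    return copied
```

Run it again after fixing whatever broke and it only copies what is still missing; the returned list tells you what moved on that run.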
I come to /. to read tech news... not to see people fisting.
Well, I came here to see the fisting. And frankly, so far this site has been a real disappointment.
I don't care if it's 90,000 hectares. That lake was not my doing.
Are you running a 9 year old version of OSX too, or are you comparing a two generation old Windows version to a nice new Mac version? It really sounds like you are comparing apples (snicker) to oranges. After all, both Vista and Windows 7 have no problem running for a long, long time between reboots and don't get slow during that time.
A big problem is the event-driven model of most user interfaces. Almost anything that needs to be done is placed on a serial event queue, which is then processed one event at a time. This prevents race conditions within the GUI, but at a high cost. Both the Mac and Windows started that way, and to a considerable extent, they still work that way. So any event which takes more time than expected stalls the whole event queue. There are attempts to fix this by having "background" processing for events known to be slow, but you have to know which ones are going to be slow in advance. Intermittently slow operations, like a DNS lookup or something which infrequently requires disk I/O, tend to be bottlenecks.
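A toy Python sketch of the stall described above, not modelled on any real GUI toolkit. The names (`run_event_loop`, `slow_lookup`, the `is_slow` flag) are invented for illustration, and a 0.2 second sleep stands in for a DNS lookup that hangs. Note that the `is_slow` flag has to be set in advance, which is exactly the weakness being pointed out.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def run_event_loop(events, offload_slow):
    """Handle (name, handler, is_slow) events in queue order.

    Returns the names in the order their handlers finished.
    """
    completed = []
    lock = threading.Lock()

    def run(name, handler):
        handler()
        with lock:
            completed.append(name)

    # Leaving the with-block waits for any offloaded handlers to finish.
    with ThreadPoolExecutor(max_workers=2) as pool:
        for name, handler, is_slow in events:
            if offload_slow and is_slow:
                pool.submit(run, name, handler)  # the queue keeps moving
            else:
                run(name, handler)  # everything behind this event waits
    return completed

def slow_lookup():
    time.sleep(0.2)  # stand-in for a DNS lookup that is slow today

def redraw():
    pass

events = [
    ("dns", slow_lookup, True),
    ("redraw", redraw, False),
    ("click", redraw, False),
]
serial_order = run_event_loop(events, offload_slow=False)
offloaded_order = run_event_loop(events, offload_slow=True)
```

In the serial run the redraw and the click sit behind the lookup. In the offloaded run they finish first and the lookup completes whenever it completes.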
Most languages still handle concurrency very badly. C and C++ are clueless about concurrency. Java and C# know a little about it. Erlang and Go take it more seriously, but are intended for server-side processing. So GUI programmers don't get much help from the language.
In particular, in C and C++, there's locking, but there's no way within the language to even talk about which locks protect which data. Thus, concurrency can't be analyzed automatically. This has become a huge mess in C/C++, as more attributes ("mutable", "volatile", per-thread storage, etc.) have been bolted on to give some hints to the compiler. There's still race condition trouble between compilers and CPUs with long look-ahead and programs with heavy concurrency.
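To make "which locks protect which data" concrete, here is a minimal sketch in Python rather than C or C++. The `Guarded` class is hypothetical: it bundles a value with the one lock that protects it, so the only documented way to reach the value is through the lock. Python does not enforce this either (a reference can leak out of the `with` block), so it is a convention, not something a tool can check.

```python
import threading
from contextlib import contextmanager

class Guarded:
    """Hypothetical wrapper: a value plus the one lock that protects it."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    @contextmanager
    def locked(self):
        # The value is only handed out while the lock is held.
        with self._lock:
            yield self._value

counter = Guarded({"hits": 0})

def worker():
    for _ in range(10_000):
        with counter.locked() as state:
            state["hits"] += 1

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

with counter.locked() as state:
    total = state["hits"]
```

The point is only that the lock and the data travel together, so a reader (or an analysis tool) can tell which lock covers what.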
We need better hard-compiled languages that don't punt on concurrency issues. C++ could potentially have been fixed, but the C++ committee is in denial about the problem; they're still in template la-la land, adding features few need and fewer will use correctly, rather than trying to do something about reliability issues. C# is only slightly better; Microsoft Research did some work on "Polyphonic C#", but nobody seems to use that. Yes, there are lots of obscure academic languages that address concurrency. Few are used in the real world.
Game programmers have more of a clue in this area. They're used to designing software that has to keep the GUI not only updated but visually consistent, even if there are delays in getting data from some external source. Game developers think a lot about systems which look consistent at all times, and come gracefully into synchronization with outside data sources as the data catches up. Modern MMORPGs do far better at handling lag than browsers do. Game developers, though, assume they own most of the available compute resources; they're not trying to minimize CPU consumption so that other work can run. (Nor do they worry too much about not running down the battery, the other big constraint today.)
Incidentally, modern tools for hardware design know far more about timing and concurrency than anything in the programming world. It's quite possible to deal with concurrency effectively. But you pay $100,000 per year per seat for the software tools used in modern CPU design.
I wish I could mod you higher than +5, you just summed up some of the things that bother me most about the OS that is somehow still the most popular desktop OS in the world.
To anyone using Windows (XP, Vista or 7) right now, go ahead and open up an Explorer window, and type in ftp:// followed by any url.
Even when it's a name that obviously won't resolve, or an ip of your very own local network of a machine that just doesn't exist, this'll hang your Explorer window for a couple of solid seconds. If you're a truly patient person, try doing that with a name that does resolve, like ftp://microsoft.com . Better yet, try stopping it.... say goodbye to your explorer.exe
This is one of the worst user experiences possible, all for a mundane task like using ftp. And this has been present in Windows for what, a decade?
+1 Funny Signature
I'm thinking you don't have much experience with .NET. During my projects it has always run comparable to native compiled code when I write my code with the mindset of a C++ programmer and not a VB one.
It seems you are severely underestimating what GCD means to the application developer. I strongly suggest you read parts 12 and 13 of John Siracusa's excellent review very carefully. As Siracusa says,
Those with some multithreaded programming experience may be unimpressed with the GCD. So Apple made a thread pool. Big deal. They've been around forever. But the angels are in the details. Yes, the implementation of queues and threads has an elegant simplicity, and baking it into the lowest levels of the OS really helps to lower the perceived barrier to entry, but it's the API built around blocks that makes Grand Central Dispatch so attractive to developers. Just as Time Machine was "the first backup system people will actually use," Grand Central Dispatch is poised to finally spread the heretofore dark art of asynchronous application design to all Mac OS X developers. I can't wait.
That's actually pretty good typing with your fists. Do you have a comically large keyboard?
I love how Microsoft can come along in 2010 and with a straight face say it's about time they took multiprocessing seriously. Or say it's about time we started putting HTML5 features into our browser. And we're finally going to support the ISO audio video standard from 2002. And by the way, it's about time we let you know that our answer to the 2007 iPhone will be shipping in 2011. And look how great it is that we just got 10% of our platform modernized off the 2001 XP version! And our office suite is just about ready to discover that the World Wide Web exists. It's like they are in a time warp.
I know they have product managers instead of product designers, and so have to crib design from the rest of the industry, necessitating them to be years behind, but on engineering stuff like multiprocessing, you expect them to at least have read the memo from Intel in 2005 about single cores not scaling and how the future was going to be 128 core chips before you know it.
I guess when you recognize that Windows Vista was really Windows 2003 and Windows 7 is really Windows 2005 then it makes some sense. It really is time for them to start taking multiprocessing seriously.
I am so glad I stopped using their products in 1999.
If we want efficient code, we have to figure out ways to reward the programmers that write it. I don't see any sign that people anywhere are interested in doing this. Anyone have suggestions for how it might be done?
It's happening, from a source people didn't expect: portable devices. Battery life is becoming a primary feature of portable devices, and a large fraction of that comes from software efficiency. Take your average cell phone: it's probably got a half dozen cores running in it. One in the wifi, one in the baseband, maybe one doing voice codec, another doing audio decode, one (or more) doing video decode and/or 3d, and some others hiding away doing odds and ends.
The portable devices industry has been doing multi-core for ages. It's how your average cell phone manages immense power savings: you can power on/off those cores as necessary, switch their frequencies, and so on. They have engineers who understand how to do this. They're rewarded for getting it right: the reward is it lives on battery longer, and it's measurable.
Yes, you can get lazy and say 'next generation CPUs will be more efficient', but you'll be beaten by your competitors for battery life. Or, you fit a bigger battery and you lose in form factor.
The world is going mobile, and that'll be the push we need to get software efficient again.
I've always thought that both data flow languages and Fortran 95 had some innovations for multi-core programming worthy of being copied.
Data flow languages such as "G", which National Instruments sells under its "LabVIEW" brand, are intrinsically parallel at many levels. They treat a function call as a list of unsatisfied inputs. These inputs wait for data to arrive and make the variables valid. Then the subroutine fires. Thus every single function is potentially a parallel process; it's just waiting on its data. If you program in a serial fashion then of course those functions get called serially. But with graphical programming in 2D, you are almost never programming serially. You are just wiring outputs of some functions to inputs of others. Serial dependencies do arise, but these are asynchronous and localized cliques. Everything else is parallel. Yet you never ever ever actually write parallel code. It just happens automatically. Perl Data Language had a glimpse of this, but it's not the same thing since the language is still Perl and thus not parallel.
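A rough Python sketch of the "fires when its inputs are valid" idea, under loud assumptions: this is not how G or LabVIEW schedules anything, `run_dataflow` and the graph format are invented for illustration, and the scheduler works in waves (everything ready now runs together) instead of firing each node the instant its data lands.

```python
from concurrent.futures import ThreadPoolExecutor

def run_dataflow(nodes, inputs):
    """nodes maps name -> (function, [input names]); inputs maps name -> value.

    Fires every node whose inputs are all available. Nodes that become
    ready in the same wave are submitted to the pool together.
    """
    values = dict(inputs)
    pending = dict(nodes)
    with ThreadPoolExecutor() as pool:
        while pending:
            ready = [
                name for name, (_, deps) in pending.items()
                if all(dep in values for dep in deps)
            ]
            if not ready:
                raise ValueError("unsatisfiable inputs: " + ", ".join(sorted(pending)))
            futures = {}
            for name in ready:
                func, deps = pending.pop(name)
                futures[name] = pool.submit(func, *[values[dep] for dep in deps])
            for name, future in futures.items():
                values[name] = future.result()
    return values

# "sum" and "diff" depend only on a and b, so they fire in the same wave;
# "product" waits until both of them have produced a value.
graph = {
    "sum": (lambda a, b: a + b, ["a", "b"]),
    "diff": (lambda a, b: a - b, ["a", "b"]),
    "product": (lambda s, d: s * d, ["sum", "diff"]),
}
result = run_dataflow(graph, {"a": 5, "b": 3})
```

Nothing in `graph` says what runs in parallel; that falls out of the wiring, which is the appeal of the model.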
Objective-C, with its "message passing" abstraction, is perhaps getting closer to the idea of data flow. One might complain that Objective-C message passing is just a different sugar coating of C, just like C++ is. This would be true from the user's point of view. But it's not as true from the operating system's point of view. In OS X, these messages are passed more like actual socket programming at the kernel level. So there's more to Objective-C on Apple's platforms than meets the eye. But I don't know how far you can push that abstraction.
In Fortran there are some rather simple but powerful multi-processor optimizations. First, there are loops like "forall" that designate that a loop can be done in any order of the loop index and even in parallel. Then there are vectorized statements built into the language, like matrix multiplies. Those are rather simple things, so they don't solve much, but they do show that you can put a lot of compiler hinting into the language itself without re-inventing the language.
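The "any order, even in parallel" contract can be imitated in Python, with the caveat that this is only an analogy: the `forall` helper below is hypothetical, it is not Fortran's construct, and in standard CPython threads will not speed up plain arithmetic. It shuffles the indices on purpose so that a loop body which secretly depends on order would be exposed.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def forall(indices, body):
    """Run body(i) for every index, in no particular order, possibly in parallel."""
    shuffled = list(indices)
    random.shuffle(shuffled)  # make the "any order" promise explicit
    with ThreadPoolExecutor() as pool:
        # list() waits for every call and re-raises any exception.
        list(pool.map(body, shuffled))

a = list(range(8))
b = [0] * len(a)

def square_into_b(i):
    # Each iteration reads a[i] and writes only b[i], so order cannot matter.
    b[i] = a[i] * a[i]

forall(range(len(a)), square_into_b)
```

The hint lives in the choice of loop construct, not in any threading code written by the programmer.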
Some drink at the fountain of knowledge. Others just gargle.
First, the article in question talks about OS architecture, not Windows specifically. He specifically states that what he is speaking about is not something MS is working on. Quite the opposite, many of his MS colleagues disagree with him.
Second, the fundamental problems with OS design are exactly that: fundamental problems with OS design. Nobody is making an OS that truly takes advantage of multiple cores, it's still single-processor thinking with the ability to use more than one processor, and this leads to a number of inherent problems.
The article talks about what an OS might look like if built from scratch specifically for multiple core processing power, and there is nothing on the market like it at the moment. It's basically a hypervisor-based OS, where instead of giving programs slices of CPU time, the OS gives programs actual CPUs and slices of memory to use.
Something like that would be extremely slick. We already do that for virtual machines, and we end up with 8+ full-fledged servers running on the same machine. Why can't you pull that back a little more so it's individual programs assigned to each CPU such that they don't have to interact with the OS at all once they are up and running? Can you imagine?
Security is mostly a superstition... Avoiding danger is no safer in the long run than outright exposure. - Helen Keller
Apple's Grand Central Dispatch (GCD) solution is really primitive. It's just a simple thread pool, where the programmer breaks their program down into tasks that can be executed independently, then queues them for execution by the thread pool.
GCD is not in the slightest innovative, except for a hack that allows C programmers to write tasks with slightly more convenience, by adding limited "closure" support to the language.
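The pattern described above (split the work into independent tasks, queue them, let a pool of threads pick them up) looks roughly like this with Python's standard `concurrent.futures` module. This is a sketch of the general thread-pool idea, not of GCD's actual API; `word_count` and `documents` are made-up stand-ins for real work.

```python
from concurrent.futures import ThreadPoolExecutor

def word_count(text):
    """One independent task: it touches no shared state."""
    return len(text.split())

documents = ["one two three", "four five", "six"]

with ThreadPoolExecutor(max_workers=4) as pool:
    # Each submit() queues one task; the pool decides which thread runs it.
    futures = [pool.submit(word_count, doc) for doc in documents]
    # Collect results in submission order, whatever order they finished in.
    counts = [future.result() for future in futures]

total_words = sum(counts)
```

The programmer still has to do the decomposition into tasks; the pool only takes care of the threads.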
Similar concepts can be found all over the place; just see the "see also" section on the wikipedia article:
http://en.wikipedia.org/wiki/Grand_Central_Dispatch
Using any of the libs listed in that "see also" section, you can get GCD equivalent behaviour on unix/windows, and have been able to for years.
There are also languages with far superior parallel-processing abilities, where the effort is done by the compiler/environment, not the programmer. See any functional language, e.g. Haskell or Erlang. Write a program in these languages, and the parallel processing happens just about automatically.
Adding parallelism to the *OS* is quite a different issue, and not one that Apple's GCD addresses.
What's wrong with at least some operating systems doesn't even have anything to do with multiple cores per se. They're simply designing the OS and its UI incorrectly, assigning the wrong priorities to events. No event should EVER supersede the ability of a user to interact and intercede with the operating system (and applications). Nothing should EVER happen to prevent a user being able to move the mouse, access the start menu, etc., yet this still happens in both Windows and Linux distributions. That's a fucked-up set of priorities, when the user sitting in front of the damned box - who probably paid for it - gets second billing when it comes to CPU cycles.
It doesn't matter if there's one CPU core or a hundred. It's the fundamental design priorities that are screwed up. Hell should freeze over before a user is denied the ability to interact, intercede, or override, regardless how many cores are present. Apparently hell has already frozen over and I just didn't get the memo?