Multicore Requires OS Rework, Windows Expert Says

Posted by timothy on Sunday March 21, 2010 @11:48AM from the ok-let's-split-up dept.

alphadogg writes "With chip makers continuing to increase the number of cores they include on each new generation of their processors, perhaps it's time to rethink the basic architecture of today's operating systems, suggested Dave Probert, a kernel architect within the Windows core operating systems division at Microsoft. The current approach to harnessing the power of multicore processors is complicated and not entirely successful, he argued. The key may not be in throwing more energy into refining techniques such as parallel programming, but rather rethinking the basic abstractions that make up the operating systems model. Today's computers don't get enough performance out of their multicore chips, Probert said. 'Why should you ever, with all this parallel hardware, ever be waiting for your computer?' he asked. Probert made his presentation at the University of Illinois at Urbana-Champaign's Universal Parallel Computing Research Center."

Current architecture flawed but workable BUT.... by syousef · 2010-03-21 11:56 · Score: 4, Interesting

...the implementation sucks.
Why, for example, does Windows Explorer decide to freeze ALL network connections when a single URN isn't quickly resolved? Why is it that when my USB drive wakes up, all Explorer windows freeze? If you are trying to tell me there's no way to implement this using the current abstractions, I say you're mad. For that matter, when a copy or move fails in Explorer, why can't I simply resume it once I've fixed whatever the problem is? Instead you're left piecing together what has and hasn't been moved. File requests make up a good deal of what we're waiting for. It's not the bus or the drives that are usually the limitation. It's the shitty coding. I can live with a hit at startup. I can live with delays if I have to eat into swap. But I'm sick and tired of basic functionality being missing or broken.

--
These posts express my own personal views, not those of my employer
reinventing the wheel by pydev · 2010-03-21 12:09 · Score: 4, Interesting

Microsoft should go back and read some of the literature on parallel computing from 20-30 years ago. Machines with many cores are nothing new. And Microsoft could have designed for it if they hadn't been busy re-implementing a bloated version of VMS.
Re:Luckily OSX is Already Has MultiCore Tech by Smurf · 2010-03-21 13:09 · Score: 4, Interesting

It seems you are severely underestimating what GCD means to the application developer. I strongly suggest you read parts 12 and 13 of John Siracusa's excellent review very carefully. As Siracusa says,

Those with some multithreaded programming experience may be unimpressed with the GCD. So Apple made a thread pool. Big deal. They've been around forever. But the angels are in the details. Yes, the implementation of queues and threads has an elegant simplicity, and baking it into the lowest levels of the OS really helps to lower the perceived barrier to entry, but it's the API built around blocks that makes Grand Central Dispatch so attractive to developers. Just as Time Machine was "the first backup system people will actually use," Grand Central Dispatch is poised to finally spread the heretofore dark art of asynchronous application design to all Mac OS X developers. I can't wait.
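As a rough illustration of the pattern the quote describes (hand a block of work to a queue, then hand the result back to a serial queue), here is a minimal Python sketch. It is an analogy only, not Apple's API: the `background` and `main_queue` executors and the `analyze_async` helper are hypothetical names standing in for a concurrent queue and a serial queue, and a Python closure stands in for a block.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins: a pool of workers plays the concurrent queue,
# a single-worker pool plays the serial "main" queue.
background = ThreadPoolExecutor(max_workers=4)
main_queue = ThreadPoolExecutor(max_workers=1)


def analyze_async(text, on_done):
    """Count words off the caller's thread, then deliver the count on main_queue."""
    def work():
        count = len(text.split())
        # Hand the result back to the serial queue, the way a UI update would be.
        return main_queue.submit(on_done, count)
    return background.submit(work)


results = []
outer = analyze_async("why should you ever be waiting", results.append)
outer.result().result()  # wait for the work, then for the callback
print(results)

background.shutdown()
main_queue.shutdown()
```

The caller never creates or joins a thread itself; it only says what to run and where the result should land, which is roughly the appeal the review attributes to the block-based API.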
Re:The problem: the event-driven model by shutdown+-p+now · 2010-03-21 13:22 · Score: 4, Interesting

This has become a huge mess in C/C++, as more attributes ("mutable", "volatile", per-thread storage, etc.) have been bolted on to give some hints to the compiler.
An interesting comment overall, but what relevance does "mutable" have to multi-threaded programming? It is just a way to say that a particular field in a class is never const, even when the object itself is as a whole. There are no optimizations the compiler could possibly derive from that (in fact, if anything, it might make some optimizations non-applicable).
Same goes for "volatile", actually. It forces the code generator to avoid caching values in registers etc, and always do direct memory reads & writes on every access to a given lvalue, but this won't prevent one core from not seeing a write done by another core - you need memory barriers for that, and ISO C++ "volatile" doesn't guarantee any (nor do any existing C++ implementations).

Microsoft Research did some work on "Polyphonic C#" [psu.edu], but nobody seems to use that.
It's a research language, not intended for production use. Microsoft Research does quite a few of those - e.g. Spec# (DbC), or C-omega (this is what Polyphonic C# evolved into), or Axum (the most recent take on concurrency, Erlang-style).
Those projects are used to "cook" some idea to see if it's feasible, what approach is the best, and how it is taken by programmers. Eventually, features from those languages end up integrated into the mainstream ones - C# and VB. For example, X# became LINQ in .NET 3.5, and Spec# became Code Contracts in .NET 4.0. So, give it time.
Energy efficiency will do it by pslam · 2010-03-21 14:13 · Score: 5, Interesting

If we want efficient code, we have to figure out ways to reward the programmers that write it. I don't see any sign that people anywhere are interested in doing this. Anyone have suggestions for how it might be done?
It's happening, from a source people didn't expect: portable devices. Battery life is becoming a primary feature of portable devices, and a large fraction of that comes from software efficiency. Take your average cell phone: it's probably got a half dozen cores running in it. One in the Wi-Fi, one in the baseband, maybe one doing voice codec, another doing audio decode, one (or more) doing video decode and/or 3D, and some others hiding away doing odds and ends.
The portable devices industry has been doing multi-core for ages. It's how your average cell phone manages immense power savings: you can power on/off those cores as necessary, switch their frequencies, and so on. They have engineers who understand how to do this. They're rewarded for getting it right: the reward is it lives on battery longer, and it's measurable.
Yes, you can get lazy and say 'next generation CPUs will be more efficient', but you'll be beaten by your competitors for battery life. Or, you fit a bigger battery and you lose in form factor.
The world is going mobile, and that'll be the push we need to get software efficient again.
Re:waiting by Courageous · 2010-03-21 14:34 · Score: 4, Interesting

Well, with the rise of the SSD, that's no longer as much of a problem.
ORLY!
Let's do some math, shall we? Take a simple 4 core Nehalem running at 2.66 GHz. Let's conservatively assume that it can process a mere *1* double-precision floating-point number per clock cycle, per core. So. How big is a double? 64 bits, or 8 bytes. Now, that's 2.66 billion * 4 = 10.64 BILLION doubles per second, which is about 85 GB/s.
The trick to understanding computing is that all computing really *is* at its heart a throughput problem.
Do you see the asymmetry in throughput between the Nehalem and your SSD?
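The arithmetic above can be checked with a few lines of Python. This is only a sketch of the comment's back-of-the-envelope estimate; the function name `required_bandwidth` is made up for illustration, and the SSD figure is an assumed round number that does not come from the thread.

```python
def required_bandwidth(clock_hz, cores, bytes_per_operand=8, operands_per_cycle=1):
    """Bytes per second needed to feed every core one operand per cycle."""
    return clock_hz * cores * operands_per_cycle * bytes_per_operand


# Figures from the comment: 4 cores at 2.66 GHz, one 8-byte double per cycle.
cpu_bytes_per_s = required_bandwidth(2.66e9, 4)
print(f"CPU appetite: {cpu_bytes_per_s / 1e9:.2f} GB/s")

# ASSUMPTION: an illustrative sequential read rate for an SSD, not a measured value.
ASSUMED_SSD_BYTES_PER_S = 250e6
ratio = cpu_bytes_per_s / ASSUMED_SSD_BYTES_PER_S
print(f"CPU could consume data about {ratio:.0f}x faster than the assumed SSD")
```

Changing the assumed SSD rate moves the ratio, but under any plausible value the gap stays at two orders of magnitude or so, which is the asymmetry the comment is pointing at.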
C//
Data flow languages by goombah99 · 2010-03-21 15:24 · Score: 5, Interesting

I've always thought that both data flow languages and Fortran 95 had some innovations for multi-core programming worthy of being copied.
Data flow languages such as "G", which National Instruments sells under its "LabVIEW" brand, are intrinsically parallel at many levels. What they do is look at a function call as a list of unsatisfied inputs. These inputs are waiting for the data to arrive to make the variables valid. Then the subroutine fires. Thus every single function is potentially a parallel process. It's just waiting on its data. If you program in a serial fashion then of course those functions get called serially. But with graphical programming in 2D, you almost never are programming serially. You are just wiring outputs of some functions to inputs of others. Serial dependencies do arise, but these are asynchronous and localized cliques. Everything else is parallel. Yet you never ever ever actually write parallel code. It just happens automatically. Perl Data Language had a glimpse of this, but it's not the same thing since the language is still Perl and thus not parallel.
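To make the "a function fires once its inputs arrive" idea concrete, here is a small Python sketch. It is not how G or LabVIEW is implemented; `run_dataflow` and the graph layout are hypothetical, and for simplicity it fires nodes in waves (everything currently ready runs in parallel, then the next wave is checked) rather than fully asynchronously.

```python
from concurrent.futures import ThreadPoolExecutor
import operator


def run_dataflow(graph, sources, workers=4):
    """graph maps node name -> (function, [input names]); sources maps name -> value."""
    values = dict(sources)
    pending = dict(graph)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while pending:
            # A node is ready when every one of its inputs has a value.
            ready = [name for name, (_, deps) in pending.items()
                     if all(d in values for d in deps)]
            if not ready:
                raise ValueError("unsatisfiable inputs: " + ", ".join(sorted(pending)))
            futures = {}
            for name in ready:
                func, deps = pending.pop(name)
                futures[name] = pool.submit(func, *(values[d] for d in deps))
            for name, fut in futures.items():
                values[name] = fut.result()
    return values


# "total" and "product" depend only on the sources, so they fire together;
# "result" fires once both of them have produced a value.
graph = {
    "total": (operator.add, ["a", "b"]),
    "product": (operator.mul, ["a", "b"]),
    "result": (operator.mul, ["total", "product"]),
}
values = run_dataflow(graph, {"a": 2, "b": 3})
print(values["result"])
```

Nothing in the graph definition mentions threads or ordering; the parallelism falls out of which inputs happen to be available, which is the property the comment credits to data flow languages.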
Objective-C with its "message passing" abstraction is perhaps getting closer to the idea of a data flow. One might complain that Objective-C message passing is just a different sugar coating of C, just like C++ is. This would be true from the user's point of view. But it's not as true from the operating system's point of view. In OS X, these messages are passed more like actual socket programming at the kernel level. So there's more to Objective-C on Apple's platforms than meets the eye. But I don't know how far you can push that abstraction.
In Fortran there are some rather simple but powerful multi-processor optimizations. First there are loops like "forall" that designate that a loop can be done in any order of the loop index and even in parallel. And then there are vectorized statements as part of the language, like matrix multiplies. Those are rather simple things so they don't solve much, but they do show that you can put a lot of compiler hinting into the language itself without re-inventing the language.

--
Some drink at the fountain of knowledge. Others just gargle.
Answer: Yes by Bigjeff5 · 2010-03-21 16:44 · Score: 4, Interesting

First, the article in question talks about OS architecture, not Windows specifically. He specifically states that what he is speaking about is not something MS is working on. Quite the opposite, many of his MS colleagues disagree with him.
Second, the fundamental problems with OS design are exactly that: fundamental problems with OS design. Nobody is making an OS that truly takes advantage of multiple cores; it's still single-processor thinking with the ability to use more than one processor, and this leads to a number of inherent problems.
The article talks about what an OS might look like if built from scratch specifically for multiple core processing power, and there is nothing on the market like it at the moment. It's basically a hypervisor-based OS, where instead of giving programs slices of CPU time, the OS gives programs actual CPUs and slices of memory to use.
Something like that would be extremely slick. We already do that for virtual machines, and we end up with 8+ full-fledged servers running on the same machine. Why can't you pull that back a little more so it's individual programs assigned to each CPU, such that they don't have to interact with the OS at all once they are up and running? Can you imagine?

--
Security is mostly a superstition... Avoiding danger is no safer in the long run than outright exposure. - Helen Keller