Slashdot Mirror


Multicore Requires OS Rework, Windows Expert Says

alphadogg writes "With chip makers continuing to increase the number of cores they include on each new generation of their processors, perhaps it's time to rethink the basic architecture of today's operating systems, suggested Dave Probert, a kernel architect within the Windows core operating systems division at Microsoft. The current approach to harnessing the power of multicore processors is complicated and not entirely successful, he argued. The key may not be in throwing more energy into refining techniques such as parallel programming, but rather rethinking the basic abstractions that make up the operating systems model. Today's computers don't get enough performance out of their multicore chips, Probert said. 'Why should you ever, with all this parallel hardware, ever be waiting for your computer?' he asked. Probert made his presentation at the University of Illinois at Urbana-Champaign's Universal Parallel Computing Research Center."

50 of 631 comments (clear)

  1. This is new?! by DavidRawling · · Score: 4, Insightful

    Oh please, this has been coming for years now. Why has it taken so long for the OS designers to get with the program? We've had multi-CPU servers for literally decades.

    1. Re:This is new?! by PhunkySchtuff · · Score: 5, Insightful

      Since when have OS designers optimised their code to milk every cycle from the available CPUs? They haven't, they just wait for hardware to get faster to keep up with the code.

    2. Re:This is new?! by Jeremi · · Score: 4, Insightful

      Why has it taken so long for the OS designers to get with the program?

      Coming up with a new OS paradigm is hard, but doable.

      Coming up with a viable new OS that uses that paradigm is much harder; because even once the new OS is working perfectly, you still have to somehow make it compatible with the zillions of existing applications that people depend on. If you can't do that, your shiny new OS will be viewed as an interesting experiment for the propeller-head set, but it won't ever get the critical mass of users necessary to build up its own application base.

      So far, I think Apple has had the most successful transition strategy: Come up with the great new OS, bundle the old OS with it, inside an emulator/sandbox, and after a few years, quietly deprecate (and then drop) the old OS. Repeat as necessary.

      --


      I don't care if it's 90,000 hectares. That lake was not my doing.
    3. Re:This is new?! by Cryacin · · Score: 5, Insightful

      For that matter, since when have software vendors been willing to pay architects/designers/engineers etc to optimise their software to milk every cycle from the available CPUs and provide useful output with the minimum of effort? They don't, they just wait for hardware to get faster to keep up with code.

      The only company that I have personally been exposed to that gives half a hoot about efficient performance is Google. It annoys me beyond belief that other companies think it's acceptable to make the user wait for minutes whilst the system recalculates data derived from a large data set, and doing those calculations multiple times just because a binding gets invoked.

      --
      Science advances one funeral at a time- Max Planck
    4. Re:This is new?! by Bengie · · Score: 3, Informative

      developing server apps to run parallel is easy, client software is hard. Many times, the cost of syncing threads is greater than the work you get from them. So you leave it single threaded. The question is, how may you design a Framework/API that is very thread friendly while making sure everything runs in the order expected all the while making it easy for bad programmers to take advantage of it.

      The biggest issue with developing async-threaded programs is logical dependencies that don't allow part to be loaded/processed before another. If from square one, you develop an app to take advantage of extra threads, it may be less efficient, but more responsive. Most programmers I talk to have issues trying to understand the interweaving logic of multi-threaded programing.

      I guess it's up to MS to make a easy to use idiot-proof threaded framework for crappy programmers to use.

    5. Re:This is new?! by jc42 · · Score: 4, Insightful

      Since when have OS designers optimised their code to milk every cycle from the available CPUs?

      This isn't just an OS-level problem. It's a failure among programmers of all sorts.

      I've been involved in software development since the late 1970s, and for the start I've heard the argument "We don't have to worry about code speed or size, because today's machines are so fast and have so much memory. This was just as common back when machines were 1,000 times slower and had 10,000 times less memory than today.

      It's the reason for Henry Petroski's famous remark that "The most amazing achievement of the computer software industry is its continuing cancellation of the steady and staggering gains made by the computer hardware industry."

      Programmers respond to faster cpu speed and more memory by making their software use more cpu cycles and more memory. They always have, and there's no sign that this is going to change. Being efficient is hard, and you don't get rewarded for it, because managers can't measure it. So it's better to add flashy eye candy and more features, which people can see.

      If we want efficient code, we have to figure out ways to reward the programmers that write it. I don't see any sign that people anywhere are interested in doing this. Anyone have suggestions for how it might be done?

      --
      Those who do study history are doomed to stand helplessly by while everyone else repeats it.
    6. Re:This is new?! by fuzzyfuzzyfungus · · Score: 4, Insightful

      I doubt that it's just google. I suspect the following:

      There are(in broad strokes, and excluding the embedded market), two basic axes on which you have to place a company or a company's software offering in order to predict its attitude with respect to efficiency.

      One is problem scale. If a program is a once-off, or an obscure niche thing, or just isn't expected to have to cope with very large data sets, putting a lot of effort into making it efficient will likely not be a priority. If the program is extremely widely distributed, or is expected to cope with massive datasets, efficiency is much more likely to be considered important(if widely distributed, cost of efficient engineering per unit falls dramatically, if expeced to cope with massive datasets, amount of hardware cost and energy cost avoided becomes significant. Tuning a process that eats 50% of a desktop CPU into one that eats 40% probably isn't worth it. Tuning a process that runs on 50,000 servers into one that runs on 40,000 easily could be).

      The second is location: If a company is running their software on their own hardware, and selling access to whatever service it provides(search engine, webmail, whatever), their software's efficiency or inefficiency imposes a direct cost on them. Their customers are paying so much per mailbox, or so much per search query, they have an incentive to use as little computer power as possible to deliver that product. If a company is selling boxed software, to be run on customer machines, their efficiency incentives are indirect. This doesn't mean "nonexistent"(a game that only runs on $2,000 enthusiast boxes is going to lose money, nobody would release such a thing. Among enthusiasts, browser JS benchmarks are a point of contention); but it generally does mean "secondary to other considerations". Customers, as a rule, are more likely to use slow software with the features they want, or slow software that released first and they became accustomed to, than fast software that is missing features or requires substantial adjustment on their part. Shockingly enough, software developers act on this fact.

      On these axes, you would strongly suspect that Google would be efficiency oriented. Their software runs on a grand scale, and most of it runs on their own servers, with the rest competing against various desktop incumbents, or not actually all that dramatically efficient(Nothing wrong with Google Earth or Sketchup; but nothing especially heroic, either). However, you would expect roughly the same of any entity similarly placed on those axes.

    7. Re:This is new?! by PhunkySchtuff · · Score: 3, Interesting

      I don't know if you noticed my sig, but I'm pretty familiar with what Apple have been up to these past few years ;-)

      What I was getting at was that, in general, programmers simply don't have the time or money to really optimise their code and now that computers are, for all intents and purposes, fast enough to not really worry about optimisations.

      Apple are doing a lot of good, as you mention, with things like Grand Central Dispatch, but the multiprocessing features in earlier versions of OS X, and even more OS 9, were nothing that was in any major way any better than that offered by, say, Windows or other Unix based OSs. In fact, in the Mac OS 9 days, the multiprocessing capabilities of Mac OS lagged quite far behind that of Windows NT at the time.

    8. Re:This is new?! by Jane+Q.+Public · · Score: 3, Interesting

      Mod parent up!

      The fact is, the vast majority of programmers (and their tools) are not going to change virtually everything they do in order to deal with multiple cores. And there's good reason for that: it hugely complicates what could otherwise be fairly simple tasks. As the number of cores expand, it gets worse to the point of simply not being practical. This is a job that properly belongs in the OS or hardware layer.

      Is it harder to design a system that decides for itself how to go about threading and multiprocessing, rather than relying on the programmer to know when it is best for that particular program? Yes! But that is irrelevant, because in the long run, that is the way it must be done. There is no other practical choice.

      I had to laugh at Intel a few years ago when they called for end-product programmers to start programming for their multicore processors. I say, "No, Intel. It is you who must cater to the programmers. They are your customers, and essential suppliers of your other customers. It is your job to make sure that your processors do what the programmers want, not the other way around!"

      Apple's decision to put provision for this in their Snow Leopard OS is a clear demonstration of their forward (and practical) thinking. Where are all the others?

    9. Re:This is new?! by stevew · · Score: 5, Informative

      Well - I can tell you that Dave Probert saw his first multi-processor about 28 years ago at Burroughs corporation. It was a dual-processor B1855. I had the pleasure with working with the guy way back then. From what I recall he then went on to work at FPS systems which was an array processor that you could add onto other machines (I think vaxen...but I could be wrong there..)

      Anyway - he has been around ALONG time.

      --
      Have you compiled your kernel today??
    10. Re:This is new?! by Brian+Gordon · · Score: 4, Insightful

      Maybe it's not a question of whether the code is efficient. Maybe it's a question of how much you're asking the code to do. It's no surprise that hardware struggles to make gains against performance demands when software developers are adding on nonsense like compositing window managers and sidebar widgets. I'm enjoying Moore's law without any cancellation.. just run a sane environment. Qt or GTK, not both, if youre running an X desktop. Nothing other than IM in the system tray. No "upgrade fever" that makes people itch for Windows Media Player 14 when older versions work fine and mplayer and winamp work better.

    11. Re:This is new?! by Mr.+Freeman · · Score: 4, Insightful

      Because Google ain't crunching data sets on fucking mobile phones. They're optimizing their servers and the applications that run on those servers because Google is so damn big that a fraction of a percent increase in efficiency translates into huge amounts of money saved through less wasted CPU time. Mobile phones aren't a part of google.

      If you phone runs a little less efficient then no one gives a damn. They want to make their phones easy to program for, which generally conflicts with efficiency.

      --
      -1 disagree is not a modifier for a reason. -1 troll, flaimbait, redundant, overrated are NOT acceptable substitutes.
    12. Re:This is new?! by amRadioHed · · Score: 3, Informative

      The iPhone certainly doesn't outperform a Nexus One. If you compare browser rendering tests the Nexus One consistently completes loading pages quite a bit faster then the iPhone. You are probably thinking of games performance, and while it's true that the iPhone gets better frame rates, you're forgetting that the Nexus One is pushing around 2.5 times more pixels so that's not exactly an apples to apples comparison.

      --
      We hope your rules and wisdom choke you / Now we are one in everlasting peace
    13. Re:This is new?! by dudpixel · · Score: 3, Informative

      Come up with the great new OS...

      hang on, this "new" OS you're referring to is basically UNIX (BSD). It was invented before Windows. Sure apple has modified it and put a shiny new layer on top (that works exceptionally smoothly, mind you), but if you wanna get technical, they didn't come up with a new OS, they improved an old one.

      --
      This seemed like a reasonable sig at the time.
    14. Re:This is new?! by jc42 · · Score: 3, Insightful

      Hey, if you liked programming for a one-byte machine, maybe you should join the quantum computer research effort. They're just now looking forward to the creation of their first 8-bit "computer" in the very near future. ;-)

      Of course, you can do a bit more computing with 8 Q-bits than you can with 8 of the more mundane bits that the rest of us are using.

      --
      Those who do study history are doomed to stand helplessly by while everyone else repeats it.
    15. Re:This is new?! by mjwx · · Score: 5, Informative

      Then if you look at iPhone OS, that has been highly, highly optimized. An iPhone 3GS with a 600MHz CPU outperforms a Nexus One with a 1000MHz CPU. The iPhone 3G with a 400MHz CPU outperforms a Palm Pre with 600MHz CPU

      Citation needed? I think you'll find that Iphone only appears to outperform Android because Android is doing a lot more then the Iphone. Further more many things that work on Android do not work on Iphone, slashdot for instance works fine on my HTC Dream or newer Motorola Milestone with the standard browser, it works even better with Dolphin browser.

      This cannot be a fair comparison until the Iphone can do everything that Android phones can, unless you want to compare functionality where Iphone is an epic failure.

      Those optimizations are part of the reason why Apple is currently undercutting both Android and Palm on price,

      Now I can tell you're full of it. All prices are incl of local taxes, and UK VAT does not apply outside the EU for those in Australia, Canada and the US.

      UK Expansys
      Motorola Milestone GBP 379
      Nexus 1 GBP 599
      Iphone 32 GB GBP 799

      AU Mobicity
      Motorola Milestone A$659
      Nexus 1 A$849
      Iphone 16 GB A$959

      The cheapest Iphone 3GS available is A$100 more expensive then the newer Motorola Milestone (droid for the Yanks) and Google Nexus One. Not to mention that both the Milestone and Nexus One can do more as well as lack the restrictions of the Iphone. But then again I suspect you were merely looking to confirm your quite obvious bias rather then do an accurate comparison.

      Apple's operating systems are not very well optimised, not even as much as Windows operating systems, Apple's OS pretend to have optimisation by providing the OS with more hardware then it needs and limiting functionality to prevent any perceived loss of speed. Most people using a Mac or Iphone rarely use the full power of the hardware, ergo an un-optimised OS goes unnoticed by the user. Here is the core of the design (in an engineering perspective) a design does not have to work well, it just has to work. The vast majority of people will ignore tiny flaws if they can get the task done, OTOH if a computer doesn't do the task the user will get annoyed no matter how pretty the interface.

      As a good developer friend of mine likes to say, "If given the choice, a user will press the 'I just want it to work today' button". OSX provides this very shiny button but only in a few select places, Windows provides this not so shiny button almost everywhere. This is why Windows is still the number one OS on the planet.

      --
      Calling someone a "hater" only means you can not rationally rebut their argument.
    16. Re:This is new?! by Anonymous Coward · · Score: 3, Interesting

      I don't think you understood the point he was trying to make. Windows has had threading since 1993 and a threadpool API since before OS X was released. The point he was making was not that Windows wasn't good enough for multiple cores, it was that the current paradigm about how OSes and apps relate wasn't good enough.

      Back when you only had a single core CPU, the OS had to share the CPU with all the apps. Thus arose the kernel/user model where the OS ran in kernel mode and the apps ran in user mode. When an app needed some system service it would stop running, the CPU would switch to kernel mode, perform the server, and go back to user mode so the app could resume. When multiple CPUs and then multiple cores per CPU became available, this model was simply expanded so that the OS ran on every CPU core. This is called the SMP (Symmetric MultiProcessing) model because every processor core has the same duties as all the others.

      I think what he's saying is that having the OS run on every core means that data structures it uses will have to be shared across all the cores in the system, causing problems like contention and false sharing. It sounds like he is considering what would happen if the OS just ran on some cores and apps ran on others. If an app needs a system service it need not stop running, switch into kernel mode, run the OS, etc. Instead it could send a message to one of the cores that the OS is running on and go about its business, hopefully staying more responsive that way. Obviously the app can't have full control of the CPU because it has to share the computer nicely, but it doesn't need a fully-blown kernel either, so the thin supervisor layer is what he related to a hypervisor.

      It may be hard to imagine a 256-core computer because Apple doesn't make any, but Windows can already run on 256 cores. Of course those are huge server boxes, but it won't be long before it's common to have desktop boxes with 256 logical CPUs (2 sockets, 32 cores/socket, 4 threads/core), and then you can imagine that a high-end server might have upwards of 2048 cores. At that point does it even make sense to have the OS running on hundreds or thousands of cores simultaneously? Probably not.

      I'm not saying that this guy has the right solution, but he has some interesting ideas worth considering.

      dom

    17. Re:This is new?! by IamTheRealMike · · Score: 4, Insightful

      Why Java for Android? This is a good question. There are several reasons (that the Android team have discussed).

      One is that ARM native code is bigger, size-wise, than Dalvik VM bytecode. So it takes up more memory. Unlike the iPhone, Android was designed from the start to multi-task between lots of different (user installed) apps. It's quite feasible to rapidly switch between apps with no delay on Android, and that means keeping multiple running programs in RAM simultaneously. So trading off some CPU time for memory is potentially a good design. Now that said, Java has some design issues that make it more profligate with heap memory than it maybe needs to be (eg utf16 for strings) so I don't have a good feel for whether the savings are cancelled out or not, but it's a justification given by the Android team.

      Another is that Java is dramatically easier to program than a C-like language. I mean, incredibly monstrously easier. One problem with languages like C++ or Objective-C is that lots of people think they understand them but very few programmers really do. Case in point - I have an Apple-mad friend who ironically programs C# servers on Windows for his day job. But he figured he'd learn iPad development. I warned him that unmanaged development was a PITA but he wasn't convinced, so I showed him a page that discussed reference counting in ObjC (retain/release). He read it and said "well that seems simple enough" - doh. Another one bites the dust. I walked him through cycle leaks, ref leaks on error paths (no smart pointers in objc!), and some basic thread safety issues. By the end he realized that what looked simple really wasn't at all.

      By going with Java, Android devs skip that pain. I'm fluent in C++ and Java, and have used both regularly in the past year. Java is reliably easier to write correct code in. I don't think it's unreasonable to base your OS on it. Microsoft has moved a lot of Windows development to .NET over the last few years for the same reasons.

      Fortunately, being based on Java doesn't mean Android is inherently inefficient. Large parts of the runtime are written in C++, and you can write parts of your own app in native code too (eg for 3D graphics). You need to use Java to use most of the OS APIs but you really shouldn't be experiencing perf problems with things like gui layout - if you are, that's a hint you need to simplify your app rather than try to micro-optimize.

    18. Re:This is new?! by julesh · · Score: 3, Informative

      One is that ARM native code is bigger, size-wise, than Dalvik VM bytecode.

      Citation needed. Dalvik is better than baseline Java bytecode, agreed. But so is ARM native code. [http://portal.acm.org/citation.cfm?id=377837&dl=GUIDE&coll=GUIDE&CFID=82959920&CFTOKEN=24064384 - "[...] the code efficiency of Java turns out to be inferior to that of ARM Thumb"]. I can find no direct comparison of ARM Thumb and Dalvik, so I can't tell you which produces the smaller code size.

      So it takes up more memory.

      Even if your first statement is true, this doesn't necessarily follow. VMs add overhead, usually using up somewhat more runtime memory to execute, particularly if a JIT is used (the current version of Dalvik doesn't have one, but the next one apparently will).

  2. waiting by mirix · · Score: 5, Insightful

    'Why should you ever, with all this parallel hardware, ever be waiting for your computer?'

    Because I/O is always going to be slow.

    --
    Sent from my PDP-11
    1. Re:waiting by DavidRawling · · Score: 4, Insightful

      Well, with the rise of the SSD, that's no longer as much of a problem. Case in point - I built a system on the weekend with a 40GB Intel SSD. Pretty much the cheapest "known-good" SSD I could get my hands on (ie TRIM support, good controller) at AUD $172, roughly the price of a 1.5TB spinning rust store - and the system only needs 22GB including apps.

      Windows boots from end of POST in about 5 seconds. 5 seconds is not even enough for the TV to turn on (it's a Media Center box). Logon is instant. App start is nigh-on instant (I've never seen Explorer appear seemingly before the Win+E key is released). This is the fastest box I've ever seen, and it's the most basic "value" processor Intel offer - the i3-530, on a cheap Asrock board with cheap RAM (true, there's a slightly cheaper "bargain basement" CPU in the G6950 or something). The whole PC cost AUD800 from a reputable supplier, and I could have bought for $650 if I'd wanted to wait in line for an hour or get abused at the cheaper places.

      Now, Intel are aiming to saturate SATA-3 (600MBps) with the next generation(s) of SSD, or so I'm told. Based on what I've seen - it's achievable, at reasonable cost, and it's not only true for sequential read access. So if the IO bottleneck disappears - because the SSD can do 30K, 50K, 100K IO operations per second? Yeah, I think it's reasonable to ask why we wait for the computer.

      Not that I think a redesign is necessary for the current architectures - Windows, BSD, Linux all scale nicely to at least 8 or 16 logical CPUs in the server world, so the 4, 6 or 8 on the desktop isn't a huge problem. But in 5 years when we have 32 CPUs on the desktop? Maybe. Or maybe we'll just be using the same apps that only need 1 CPU most of the time, and using the other 20 CPUs for real-time stuff (Real voice control? Motion control and recognition?)

    2. Re:waiting by Courageous · · Score: 4, Interesting

      Well, with the rise of the SSD, that's no longer as much of a problem.

      ORLY!

      Let's do some math shall we? Take a simple 4 core Nehalem running at 2.66Ghz. Let's conservatively assume that it can complete a mere *1* double precision floating point number per clock cycle, per core. So. How big is a double? 64 bits, or 8 bytes. Now, that's 2.66 billion * 4 = 10.64 BILLION doubles per second, which is 85 GB/s.

      The trick to understanding computing is that all computing really *is* at its heart a throughput problem.

      Do you see the asymmetry in throughput b/t the Nehalem and your SSD?

      C//

  3. The problem isnt even that simple by indrora · · Score: 5, Insightful

    The problem is that most (if not all) peripheral hardware is not parallel in many senses. Hardware in today's computers is serial: You access one device, then another, then another. There are some cases (such as a few good emulators) which use muti-threaded emulation (sound in one thread, graphics in another) but fundamentally the biggest performance kill is the final IRQs that get called to process data. The structure of modern day computers must change to take advantage of multicore systems.

  4. Grand Central? by volfreak · · Score: 3, Insightful

    Isn't this the reason for Apple to have rolled out GrandCentral in Snow Leopard? If so, it seems it's not THAT hard to do - at least not that hard for a non-Windows OS.

    1. Re:Grand Central? by jasmusic · · Score: 4, Insightful

      I'm thinking you don't have much experience with .NET. During my projects it has always run comparable to native compiled code when I write my code with the mindset of a C++ programmer and not a VB one.

  5. Current architecture flawed but workable BUT.... by syousef · · Score: 4, Interesting

    ...the implementation sucks.

    Why for example does Windows Explorer decide to freeze ALL network connections when a single URN isn't quickly resolved? Why is it that when my USB drive wakes up, all explorer windows freeze? If you are trying to tell me there's no way using the current abstractions to implement this I say you're mad. For that matter when a copy or move fails in Explorer, why can't I simply resume it once I've fixed whatever the problem is. You're left piecing together what has and hasn't been moved. File requests make up a good deal of what we're waiting for. It's not the bus or the drives that are usually the limitation. It's the shitty coding. I can live with a hit at startup. I can live with delays if I have to eat into swap. But I'm sick and tired of basic functionality being missing or broken.

    --
    These posts express my own personal views, not those of my employer
  6. Re:Because by Anonymous Coward · · Score: 3, Insightful

    Why should you ever, with all this parallel hardware, ever be waiting for your computer?' he asked.

    Because it might be waiting for I/O.

    That's no reason for the entire GUI to freeze on Windows when you insert a CD.

  7. Re:Fist post! by SilverEyes · · Score: 4, Funny

    Fist post!

    I come to /. to read tech news... not so see people fisting.

    --
    Interesting.
  8. reinventing the wheel by pydev · · Score: 4, Interesting

    Microsoft should go back and read some of the literature on parallel computing from 20-30 years ago. Machines with many cores are nothing new. And Microsoft could have designed for it if they hadn't been busy re-implementing a bloated version of VMS.

  9. Re:Current architecture flawed but workable BUT... by Threni · · Score: 5, Insightful

    Windows explorer sucks. It always just abandons copies after a fail - even if you're moving thousands of files over a network. Yes, you're left wondering which files did/didn't make it. It's actually easier to sometimes copy all the files you want to shift locally, then move the copy, so that you can resume after a fail. It's laughable you have to do this, however.

    But it's not a concurrency issue, and neither, really, are the first 2 problems you mention. They're also down to Windows Explorer sucking.

  10. Re:Fist post! by Jeremi · · Score: 4, Funny

    I come to /. to read tech news... not so see people fisting.

    Well, I came here to see the fisting. And frankly, so far this site has been a real disappointment.

    --


    I don't care if it's 90,000 hectares. That lake was not my doing.
  11. Re:I hate to say it, but... by GIL_Dude · · Score: 4, Insightful

    Are you running a 9 year old version of OSX too, or are you comparing a two generation old Windows version to a nice new Mac version? It really sounds like you are comparing apples (snicker) to oranges. After all, both Vista and Windows 7 have no problem running for a long, long time between reboots and don't get slow during that time.

  12. Multithreading is the problem, not the answer by Anonymous Coward · · Score: 3, Interesting

    The Problem with Threads (UC Berkeley's Prof Edward Lee)
    How to Solve the Parallel Programming Crisis
    Half a Century of Crappy Computing

    The computer industry will have to wake up to reality sooner or later. We must reinvent the computer; there is no getting around this. The old paradigms from the 20th century do not work anymore because they were not designed for parallel processing.

  13. Duh by Waffle+Iron · · Score: 3, Funny

    Why should you ever, with all this parallel hardware, ever be waiting for your computer?'

    For a lot of problems, for the same reason that some guy who just married 8 brides will still have to wait for his baby.

    1. Re:Duh by keeboo · · Score: 3, Insightful

      Why should you ever, with all this parallel hardware, ever be waiting for your computer?'

      For a lot of problems, for the same reason that some guy who just married 8 brides will still have to wait for his baby.

      Of course, he'll be able to get 8 babies at once, assuming none of the processes crash during the computation.

      That improves bandwidth, but not latency: almost 1 baby/month, but 9 months of latency.
      The guy could try interleaving the pregnancies, in order to get the illusion of lower latency.

  14. The problem: the event-driven model by Animats · · Score: 4, Informative

    A big problem is the event-driven model of most user interfaces. Almost anything that needs to be done is placed on a serial event queue, which is then processed one event at a time. This prevents race conditions within the GUI, but at a high cost. Both the Mac and Windows started that way, and to a considerable extent, they still work that way. So any event which takes more time than expected stalls the whole event queue. There are attempts to fix this by having "background" processing for events known to be slow, but you have to know which ones are going to be slow in advance. Intermittently slow operations, like an DNS lookup or something which infrequently requires disk I/O, tend to be bottlenecks.

    Most languages still handle concurrency very badly. C and C++ are clueless about concurrency. Java and C# know a little about it. Erlang and Go take it more seriously, but are intended for server-side processing. So GUI programmers don't get much help from the language.

    In particular, in C and C++, there's locking, but there's no way within the language to even talk about which locks protect which data. Thus, concurrency can't be analyzed automatically. This has become a huge mess in C/C++, as more attributes ("mutable", "volatile", per-thread storage, etc.) have been bolted on to give some hints to the compiler. There's still race condition trouble between compilers and CPUs with long look-ahead and programs with heavy concurrency.

    We need better hard-compiled languages that don't punt on concurrency issues. C++ could potentially have been fixed, but the C++ committee is in denial about the problem; they're still in template la-la land, adding features few need and fewer will use correctly, rather than trying to do something about reliability issues. C# is only slightly better; Microsoft Research did some work on "Polyphonic C#", but nobody seems to use that. Yes, there are lots of obscure academic languages that address concurrency. Few are used in the real world.

    Game programmers have more of a clue in this area. They're used to designing software that has to keep the GUI not only updated but visually consistent, even if there are delays in getting data from some external source. Game developers think a lot about systems which look consistent at all times, and come gracefully into synchronization with outside data sources as the data catches up. Modern MMORPGs do far better at handling lag than browsers do. Game developers, though, assume they own most of the available compute resources; they're not trying to minimize CPU consumption so that other work can run. (Nor do they worry too much about not running down the battery, the other big constraint today.)

    Incidentally, modern tools for hardware design know far more about timing and concurrency than anything in the programming world. It's quite possible to deal with concurrency effectively. But you pay $100,000 per year per seat for the software tools used in modern CPU design.

    1. Re:The problem: the event-driven model by shutdown+-p+now · · Score: 4, Interesting

      This has become a huge mess in C/C++, as more attributes ("mutable", "volatile", per-thread storage, etc.) have been bolted on to give some hints to the compiler.

      An interesting comment overall, but what relevance does "mutable" have to multi-threaded programming? It is just a way to say that a particular field in a class is never const, even when the object itself is as a whole. There are no optimizations the compiler could possibly derive from that (in fact, if anything, it might make some optimizations non-applicable).

      Same goes for "volatile", actually. It forces the code generator to avoid caching values in registers etc, and always do direct memory reads & writes on every access to a given lvalue, but this won't prevent one core from not seeing a write done by another core - you need memory barriers for that, and ISO C++ "volatile" doesn't guarantee any (nor do any existing C++ implementations).

      Microsoft Research did some work on "Polyphonic C#" [psu.edu], but nobody seems to use that.

      It's a research language, not intended for production use. Microsoft Research does quite a few of those - e.g. Spec# (DbC), or C-omega (this is what Polyphonic C# evolved into), or Axum (the most recent take at concurrency, Erlang-style).

      Those projects are used to "cook" some idea to see if it's feasible, what approach is the best, and how it is taken by programmers. Eventually, features from those languages end up integrated into the mainstream ones - C# and VB. For example, X# became LINQ in .NET 3.5, and Spec# became Code Contracts in .NET 4.0. So, give it time.

    2. Re:The problem: the event-driven model by thesuperbigfrog · · Score: 3, Interesting

      Most languages still handle concurrency very badly. C and C++ are clueless about concurrency. Java and C# know a little about it. Erlang and Go take it more seriously, but are intended for server-side processing. So GUI programmers don't get much help from the language.

      In particular, in C and C++, there's locking, but there's no way within the language to even talk about which locks protect which data. Thus, concurrency can't be analyzed automatically. This has become a huge mess in C/C++, as more attributes ("mutable", "volatile", per-thread storage, etc.) have been bolted on to give some hints to the compiler. There's still race condition trouble between compilers and CPUs with long look-ahead and programs with heavy concurrency.

      We need better hard-compiled languages that don't punt on concurrency issues. C++ could potentially have been fixed, but the C++ committee is in denial about the problem; they're still in template la-la land, adding features few need and fewer will use correctly, rather than trying to do something about reliability issues. C# is only slightly better; Microsoft Research did some work on "Polyphonic C#", but nobody seems to use that. Yes, there are lots of obscure academic languages that address concurrency. Few are used in the real world.

      Ada 2005's task model is a real world, production quality approach to include concurrency in a hard-compiled language. Ada isn't exactly known for its GUI libraries (there is GtkAda), but it could be used as a foundation for an improved concurrent GUI paradigm.

      This book covers the subject quite well.

      --
      42
  15. Re:Current architecture flawed but workable BUT... by Kenz0r · · Score: 4, Insightful

    I wish I could mod you higher than +5, you just summed up some of the things that bother me most about the OS that is somehow still the most popular desktop OS in the world.

    To anyone using Windows (XP, Vista or 7) right now, go ahead and open up an Explorer window, and type in ftp:// followed by any url.
    Even when it's a name that obviously won't resolve, or an ip of your very own local network of a machine that just doesn't exist, this'll hang your Explorer window for a couple of solid seconds. If you're a truly patient person, try doing that with a name that does resolve, like ftp://microsoft.com . Better yet, try stopping it.... say goodbye to your explorer.exe .

    This is one of the worst user experiences possible, all for a mundane task like using ftp. And this has been present in Windows for what, a decade?

    --
    +1 Funny Signature
  16. Re:I hate to say it, but... by drsmithy · · Score: 3, Interesting

    I noticed the same on my mac. With a set of eight CPU graph meters in the menu bar, they're almost always evenly pitched anywhere from idle to 100%, with a few notable exceptions like second life, some photoshop filters, and firefox of all things.
    When booted into Win, more often than not I have two cores pegged high, and the others idle. Getting even use out of all cores is the exception, not the rule.

    This is pretty much completely down to the application mix. Windows has no trouble whatsoever scheduling processes and threads to max out 8 (or 16, or whatever) CPUs, but if the applications are only coded to have, say, 1 or 2 "processing" threads, then there's nothing the OS can do to change that.

  17. Re:Luckily OSX is Already Has MultiCore Tech by shutdown+-p+now · · Score: 3, Insightful

    The trick with GCD is that it is somewhat more high-level than a simple thread pool - it operates in terms of tasks, not threads. The difference is that tasks have explicit dependencies on other tasks - this lets scheduler be smarter about allocating cores.

  18. Re:Luckily OSX is Already Has MultiCore Tech by Smurf · · Score: 4, Interesting

    It seems you are severely underestimating what GCD means to the application developer. I strongly suggest you read parts 12 and 13 of John Siracusa's excellent review very carefully. As Siracusa says,

    Those with some multithreaded programming experience may be unimpressed with the GCD. So Apple made a thread pool. Big deal. They've been around forever. But the angels are in the details. Yes, the implementation of queues and threads has an elegant simplicity, and baking it into the lowest levels of the OS really helps to lower the perceived barrier to entry, but it's the API built around blocks that makes Grand Central Dispatch so attractive to developers. Just as Time Machine was "the first backup system people will actually use," Grand Central Dispatch is poised to finally spread the heretofore dark art of asynchronous application design to all Mac OS X developers. I can't wait.

  19. Re:Fist post! by omfgnosis · · Score: 4, Funny

    That's actually pretty good typing with your fists. Do you have a comically large keyboard?

  20. Microsoft's slowness and Windows 2005 by gig · · Score: 5, Insightful

    I love how Microsoft can come along in 2010 and with a straight face say it's about time they took multiprocessing seriously. Or say it's about time we started putting HTML5 features into our browser. And we're finally going to support the ISO audio video standard from 2002. And by the way, it's about time we let you know that our answer to the 2007 iPhone will be shipping in 2011. And look how great it is that we just got 10% of our platform modernized off the 2001 XP version! And our office suite is just about ready to discover that the World Wide Web exists. It's like they are in a time warp.

    I know they have product managers instead of product designers, and so have to crib design from the rest of the industry, necessitating them to be years behind, but on engineering stuff like multiprocessing, you expect them to at least have read the memo from Intel in 2005 about single cores not scaling and how the future was going to be 128 core chips before you know it.

    I guess when you recognize that Windows Vista was really Windows 2003 and Windows 7 is really Windows 2005 then it makes some sense. It really is time for them to start taking multiprocessing seriously.

    I am so glad I stopped using their products in 1999.

  21. Energy efficiency will do it by pslam · · Score: 5, Interesting

    If we want efficient code, we have to figure out ways to reward the programmers that write it. I don't see any sign that people anywhere are interested in doing this. Anyone have suggestions for how it might be done?

    It's happening, from a source people didn't expect: portable devices. Battery life is becoming a primary feature of portable devices, and a large fraction of that comes from software efficiency. Take your average cell phone: it's probably got a half dozen cores running in it. One in the wifi, one in the baseband, maybe one doing voice codec, another doing audio decode, one (or more) doing video decode and/or 3d, and some others hiding away doing odds and ends.

    The portable devices industry has been doing multi-core for ages. It's how your average cell phone manages immense power savings: you can power on/off those cores as necessary, switch their frequencies, and so on. They have engineers who understand how to do this. They're rewarded for getting it right: the reward is it lives on battery longer, and it's measurable.

    Yes, you can get lazy and say 'next generation CPUs will be more efficient', but you'll be beaten by your competitors for battery life. Or, you fit a bigger battery and you lose in form factor.

    The world is going mobile, and that'll be the push we need to get software efficient again.

  22. Data flow languages by goombah99 · · Score: 5, Interesting

    I've always thought that both data flow languages and fortran95 had some innovations for multi-core programming worthy of being copied.

    Data flow languages such as "G" which is sold as national instruments "labview" brand are intrinsically parallel at many levels. What they do is look at a function call as a list of unsatisfied inputs. These inputs are waiting for the data to arrive to make the variables valid. Then the subroutine fires. Thus every single function is potenitally a parallel process. it's just waiting on it's data. If you program in a serial fashion then of course those functions get called serially. But with graphic programming in 2D, you almost never are programming serially. You are just wiring outputs of other functions to inputs of others. Serial dependencies do arise but these are asynchronous and localized cliques. everything else is parallel. Yet you never ever ever actually write parallel code. it just happens automatically. Perl data language had a glimpse of this but it's not the same thing since the language is still perl and thus not parallel.

    Objective-C with it's "message passing" abstraction is perhaps getting closer to the idea of a data flow. While one might complain that well objective-C message passing is just a different sugar coating of C just like C++ is. This would be true from the user's point of view. But it's not as true from the Operating system's point of view. IN OSX, these messages are passing more like actual socket programming at the kernel level. So there's more to objective C on apple's than meets the eye. But I don't know how far you can push that abstraction.

    In fortran there are some rather simple but powerful multi-processor optimizations. First there's loops like "forall" that designate that a loop can be done in any order of the loop index and even in parallel. and then there's vectorized statements as part of the language like matrix multiplies. those are rather simple things so don't solve much but they do show that you can put a lot of compiler hinting into the language itself without re-inventing the language.

    --
    Some drink at the fountain of knowledge. Others just gargle.
  23. Answer: Yes by Bigjeff5 · · Score: 4, Interesting

    First, the article in question talks about OS architecture, not Windows specifically. He specifically states that what he is speaking about is not something MS is working on. Quite the opposite, many of his MS colleagues disagree with him.

    Second, the fundamental problems with OS design are exactly that: fundamental problems with OS design. Nobody is making an OS that truly takes advantage of multiple cores, it's still single-processor thinking with the ability to use more than one processor, and this leads to a number of inherent problems.

    The article talks about what an OS might look like if built from scratch specifically for multiple core processing power, and there is nothing on the market like it at the moment. It's basically a hypervisor-based OS, where instead of giving programs slices of CPU time, the OS gives programs actual CPUs and slices of memory to use.

    Something like that would be extremely slick, we already do that for virtual machines and we end up with 8+ full-fledged servers running on the same machine. Why can't you pull that back a little more so it's individual programs assigned to each CPU such that they don't have to interact with the OS at all once they are up and running? Can you imagine?

    --
    Security is mostly a superstition... Avoiding danger is no safer in the long run than outright exposure. - Helen Keller
  24. Apple Grand Central Sucks by Anonymous Coward · · Score: 4, Informative

    Apple's grand-central dispatch (GCD) solution is really primitive. It's just a simple thread-pool, where the programmer breaks their program down into tasks that can be executed independently then queues them for execution by the thread-pool.

    GCD is not in the slightest innovative, except for a hack that allows "c" programmers to write tasks with slightly more convenience, by adding limited "closure" support to the language.

    Similar concepts can be found all over the place; just see the "see also" section on the wikipedia article:
        http://en.wikipedia.org/wiki/Grand_Central_Dispatch
    Using any of the libs listed in that "see also" section, you can get GCD equivalent behaviour on unix/windows, and have been able to for years.

    There are also languages with far superior parallel-processing abilities, where the effort is done by the compiler/environment, not the programmer. See any functional language, eg Haskell or Erlang. Write a program in these languages, and the parallel-processing happens just about automatically.

    Adding parallelism to the *OS* is quite a different issue, and not one that Apple's GCD addresses.

  25. It's not even about multiple cores by macraig · · Score: 4, Insightful

    What's wrong with at least some operating systems doesn't even have anything to do with multiple cores per se. They're simply designing the OS and its UI incorrectly, assigning the wrong priorities to events. No event should EVER supersede the ability of a user to interact and intercede with the operating system (and applications). Nothing should EVER happen to prevent a user being able to move the mouse, access the start menu, etc., yet this still happens in both Windows and Linux distributions. That's a fucked-up set of priorities, when the user sitting in front of the damned box - who probably paid for it - gets second billing when it comes to CPU cycles.

    It doesn't matter if there's one CPU core or a hundred. It's the fundamental design priorities that are screwed up. Hell should freeze over before a user is denied the ability to interact, intercede, or override, regardless how many cores are present. Apparently hell has already frozen over and I just didn't get the memo?

  26. Re:Fist post! by Whalou · · Score: 3, Funny

    I'd browse at -2 if I could.

    --
    English is not this .sig mother tongue...