Apple Open Sources Grand Central Dispatch
bonch writes "Apple has open sourced libdispatch, also known as Grand Central Dispatch, which is technology in Snow Leopard that makes it easier for developers to take advantage of multi-core parallelism. Kernel support is not required, but performance optimizations Apple made for supporting GCD are visible in xnu. Block support in C is required and is currently available in LLVM (note that Apple has submitted their implementation of C blocks for standardization)." Update: 09/11 15:32 GMT by KD : Drew McCormack has a post up speculating on what Apple's move means to Linux and other communities (but probably not Microsoft): "...this is also very interesting for scientific developers. It may be possible to parallelize code in the not too distant future using Grand Central Dispatch, and run that code not only on Macs, but also on clusters and supercomputers."
I'm not too well versed in Cocoa development. I pushed some code that should have been in a separate thread into GCD, which requires you to use a block. All in all, I had to add an include, 1 line of code and a closing bracket.
Apple has made some seriously cool stuff here.
It's a library for task parallelism using a thread pool, introduced in Mac OS X 10.6 (Snow Leopard). Wikipedia tells all.
"We recognize that libdispatch is a new technology and you likely have many questions. Here are some documentation resources for getting started:
Introducing Blocks and Grand Central Dispatch
Concurrency Programming Guide
Grand Central Dispatch (GCD) Reference"
Blocks are sections of program that can be passed around between functions as arguments. They basically allow 'functional' programming in C.
'Sensible' is a curse word.
Blocks:
In Snow Leopard, Apple has introduced a C language extension called "blocks." Blocks add closures and anonymous functions to C and the C-derived languages C++, Objective-C, and Objective C++.
Perhaps the simplest way to explain blocks is that they make functions another form of data. C-derived languages already have function pointers, which can be passed around like data, but these can only point to functions created at compile time. The only way to influence the behavior of such a function is by passing different arguments to the function or by setting global variables which are then accessed from within the function. Both of these approaches have big disadvantages
Full Read: http://arstechnica.com/apple/reviews/2009/08/mac-os-x-10-6.ars/10
Directly in line with blocks is Grand Central Dispatch (and this is, where blocks become really usefull):
GDC is a a technology to resolve the concurrency conundrum by giving programmers a very easy way to split tasks into multiple sub-tasks which can then be loaded onto different threads/cpu. All this also works with normal threading, but GDC makes the process far easier, with the intention to prepare OSX for future multicore machines:
http://arstechnica.com/apple/reviews/2009/08/mac-os-x-10-6.ars/12
It does so by using blocks as separate tasks:
http://arstechnica.com/apple/reviews/2009/08/mac-os-x-10-6.ars/13
"When I first heard about Grand Central Dispatch, I was extremely skeptical. The greatest minds in computer science have been working for decades on the problem of how best to extract parallelism from computing workloads. Now here was Apple apparently promising to solve this problem. Ridiculous.
But Grand Central Dispatch doesn't actually address this issue at all. It offers no help whatsoever in deciding how to split your work up into independently executable tasksâ"that is, deciding what pieces can or should be executed asynchronously or in parallel. That's still entirely up to the developer (and still a tough problem). What GCD does instead is much more pragmatic. Once a developer has identified something that can be split off into a separate task, GCD makes it as easy and non-invasive as possible to actually do so.
The use of FIFO queues, and especially the existence of serialized queues, seems counter to the spirit of ubiquitous concurrency. But we've seen where the Platonic ideal of multithreading leads, and it's not a pleasant place for developers.
One of Apple's slogans for Grand Central Dispatch is "islands of serialization in a sea of concurrency." That does a great job of capturing the practical reality of adding more concurrency to run-of-the-mill desktop applications. Those islands are what isolate developers from the thorny problems of simultaneous data access, deadlock, and other pitfalls of multithreading. Developers are encouraged to identify functions of their applications that would be better executed off the main thread, even if they're made up of several sequential or otherwise partially interdependent tasks. GCD makes it easy to break off the entire unit of work while maintaining the existing order and dependencies between subtasks." (source = above url)
ArsTechnica always does a pretty thorough and reasonably technical review of each OSX release, and the latest one gives a pretty good explanation of GCD as well as Blocks.
http://arstechnica.com/apple/reviews/2009/08/mac-os-x-10-6.ars
The GCD stuff in particular starts on page 12, but the previous couple pages give a little bit of useful background on why it's important.
One time I threw a brick at a duck.
You don't think that libdispatch will be very genial to widespread usage, as it has a lot of OS-specific calls, which is an understandable position to take. But as an alternative you offer something whose "only caveat" is that it needs an entirely different compiler to build. A compiler whose most recent activity dates from two years ago.
... How is that a superior alternative?
You are posting in a thread about the fact that Apple made their implementation open source and you are claiming vendor lock-in?
Are you one of those rabid Apple-haters we see so often around here? Or are you just amazingly stupid?
GCD and OpenMP have very little in common. OpenMP is a language extension. It requires the programmer to understand what environment their program is going to run in, what variables can be shared and how, etc. GCD merely asks you to identify blocks of code that are independent and it handles parsing them out to threads, variable replication, etc. It's the difference between providing detailed blueprints of a car (the OpenMP way) and just saying "I want a car" (the GCD way). You can *almost* think of GCD as a user-friendly frontend for OpenMP.
What if you're running two applications that both are capable of monopolizing all your cpu time? How will your app know that it's only going to get 50% of the available cpu time form the OS, so it should only start threads for half the cpus?
GCD decides how many threads a collection of tasks should be split across. If an app running on an 8-core machine wants to run 100 tasks, then they could be spread across anywhere from 1 to 8 threads, depending on what else is running. Since it's the OS that knows what else is running, it can make more intelligent decisions about how many threads should be running.
Grand central dispatch has many innovations, but the key feature it provides is that thread pooling is now handled by the OS not the program. This means that in a dynamic environment you don't have each application stepping on each other when they ask for too many threads --all total-- than the multi-core system can optimally handle. So if Mail asks for fifty threads and Firefox asks for fifty threads and CPU you are running on can realistically only handle 10 threads then GCD figures out how to manage things so you don't get a spinning beachball.
It turns out a lot of tricks were required to do this including a lot of things like just in time compiling LVM and this C-Blocks stuff, but that's way over my head.
Some drink at the fountain of knowledge. Others just gargle.
I've come to really like GCD; I haven't played with it much in Cocoa (Obj-C) but I've been moving some of the stuff I wrote a long time ago in C to use it and I think I can say that what it does is *really* *really* awesome. It helps when writing code to be run in parallel; it does is not help you in determining *what* should be done in parallel. By putting your work into queues, by way of closures (yeah, blocks, whatever...I'm sticking with the closure name), it's up to the underlying OS to determine what thread gets what work, and on what processor. Having worked with multithreaded stuff on Windows, and calling GetThreadAffinityMask or whatever it was, and being told that it's just a *hint* to the OS, which is free to ignore you, which it always did, GCD really does spread out the work evenly among my 16-proc MacPro, and then turns around and does it just as well on the dual-core mini.
I've wanted something like this for years; a really decent OS thread scheduler that divides up the work on the other processors in a sensible fashion. I was even looking into how much effort it would take to write something like this from scratch for Linux, and now I don't even have to. Sweet!
Caveats: This is in OS X only, so no iPhone GCD (at least, not yet...not really necessary until we have multi-core iPhones), and while I've lived with additions to C++ through the years (templates mostly), the idea of adding, well, anything to C seems strange, let alone something as run-time dependent as closures.