Apple Open Sources Grand Central Dispatch
bonch writes "Apple has open sourced libdispatch, also known as Grand Central Dispatch, which is technology in Snow Leopard that makes it easier for developers to take advantage of multi-core parallelism. Kernel support is not required, but performance optimizations Apple made for supporting GCD are visible in xnu. Block support in C is required and is currently available in LLVM (note that Apple has submitted their implementation of C blocks for standardization)." Update: 09/11 15:32 GMT by KD : Drew McCormack has a post up speculating on what Apple's move means to Linux and other communities (but probably not Microsoft): "...this is also very interesting for scientific developers. It may be possible to parallelize code in the not too distant future using Grand Central Dispatch, and run that code not only on Macs, but also on clusters and supercomputers."
Can somebody explain what this "blocks" is? I mean, C being a block-structured language, I thought it already supported them...
I'm not too well versed in Cocoa development. I pushed some code that should have been in a separate thread into GCD, which requires you to use a block. All in all, I had to add an include, 1 line of code and a closing bracket.
Apple has made some seriously cool stuff here.
It's a library for task parallelism using a thread pool, introduced in Mac OS X 10.6 (Snow Leopard). Wikipedia tells all.
"We recognize that libdispatch is a new technology and you likely have many questions. Here are some documentation resources for getting started:
Introducing Blocks and Grand Central Dispatch
Concurrency Programming Guide
Grand Central Dispatch (GCD) Reference"
There's a decent article on wikipedia about it. Basically, it's Apple's multithreading algorithms.
>> "What would the robut do? Frame someone!"
From the first paragraph:
Grand Central Dispatch (GCD) is a revolutionary approach to multicore computing that is woven throughout the fabric of Mac OS X version 10.6 Snow Leopard. GCD combines an easy-to-use programming model with highly-efficient system services to radically simplify the code needed to make best use of multiple processors. The technologies in GCD improve the performance, efficiency, and responsiveness of Snow Leopard out of the box, and will deliver even greater benefits as more developers adopt them.
libdispatch is the open source implementation of GCD.
Gentlemen! You can't fight in here, this is the war room!
It is operating system infrastructure that allows programmers to harness modern programmable graphics hardware and regular CPUs as a single anonymous "multi-threaded" resource.
The idea being that now GPUs are becoming double precision capable they are looking increasingly like super computers from yester-year and Grand Central Dispatch allows programmers to easily capitalise on this advance.
Blocks:
In Snow Leopard, Apple has introduced a C language extension called "blocks." Blocks add closures and anonymous functions to C and the C-derived languages C++, Objective-C, and Objective C++.
Perhaps the simplest way to explain blocks is that they make functions another form of data. C-derived languages already have function pointers, which can be passed around like data, but these can only point to functions created at compile time. The only way to influence the behavior of such a function is by passing different arguments to the function or by setting global variables which are then accessed from within the function. Both of these approaches have big disadvantages
Full Read: http://arstechnica.com/apple/reviews/2009/08/mac-os-x-10-6.ars/10
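To make the function-pointer limitation concrete, here is a rough sketch. It is written in Python rather than C, with Python closures standing in for blocks, and the names `scale_by_global` and `make_scaler` are made up for illustration. It only shows the idea of a function that carries captured state with it, not Apple's actual block syntax.

```python
# Illustration only: Python closures standing in for C blocks.
# All names here are hypothetical, not part of any Apple API.

FACTOR = 2  # global state: the only "knob" besides arguments


def scale_by_global(x):
    """Like a plain C function: only arguments and globals affect it."""
    return x * FACTOR


def make_scaler(factor):
    """Build a new function at run time that captures `factor`,
    roughly the way a block captures variables from its enclosing scope."""
    def scaler(x):
        return x * factor  # `factor` outlives the call to make_scaler
    return scaler


if __name__ == "__main__":
    double = make_scaler(2)
    triple = make_scaler(3)
    # Two independent behaviours, created at run time, with no globals.
    print(double(10), triple(10))
```

The closure version needs neither an extra argument at every call site nor a shared global, which is the disadvantage the quoted passage is getting at.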
Directly in line with blocks is Grand Central Dispatch (and this is where blocks become really useful):
GCD is a technology to resolve the concurrency conundrum by giving programmers a very easy way to split tasks into multiple sub-tasks which can then be loaded onto different threads/CPUs. All this also works with normal threading, but GCD makes the process far easier, with the intention of preparing OS X for future multicore machines:
http://arstechnica.com/apple/reviews/2009/08/mac-os-x-10-6.ars/12
It does so by using blocks as separate tasks:
http://arstechnica.com/apple/reviews/2009/08/mac-os-x-10-6.ars/13
"When I first heard about Grand Central Dispatch, I was extremely skeptical. The greatest minds in computer science have been working for decades on the problem of how best to extract parallelism from computing workloads. Now here was Apple apparently promising to solve this problem. Ridiculous.
But Grand Central Dispatch doesn't actually address this issue at all. It offers no help whatsoever in deciding how to split your work up into independently executable tasks, that is, deciding what pieces can or should be executed asynchronously or in parallel. That's still entirely up to the developer (and still a tough problem). What GCD does instead is much more pragmatic. Once a developer has identified something that can be split off into a separate task, GCD makes it as easy and non-invasive as possible to actually do so.
The use of FIFO queues, and especially the existence of serialized queues, seems counter to the spirit of ubiquitous concurrency. But we've seen where the Platonic ideal of multithreading leads, and it's not a pleasant place for developers.
One of Apple's slogans for Grand Central Dispatch is "islands of serialization in a sea of concurrency." That does a great job of capturing the practical reality of adding more concurrency to run-of-the-mill desktop applications. Those islands are what isolate developers from the thorny problems of simultaneous data access, deadlock, and other pitfalls of multithreading. Developers are encouraged to identify functions of their applications that would be better executed off the main thread, even if they're made up of several sequential or otherwise partially interdependent tasks. GCD makes it easy to break off the entire unit of work while maintaining the existing order and dependencies between subtasks." (source = above url)
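The "islands of serialization" idea can be sketched with a single-worker queue. This is not the GCD API; it uses Python's standard `ThreadPoolExecutor` with one worker as a stand-in for a serial queue, and `run_on_serial_queue` is a hypothetical name.

```python
# Sketch of a "serial queue": one worker, tasks run in submission (FIFO) order.
# Standard library only; this is an analogy, not libdispatch.
from concurrent.futures import ThreadPoolExecutor


def run_on_serial_queue(n_tasks):
    """Queue n_tasks closures on a one-worker executor and return their log."""
    log = []  # shared state, touched only by tasks on the serial queue

    def make_task(i):
        def task():
            log.append(i)  # no lock needed: only one task runs at a time
        return task

    with ThreadPoolExecutor(max_workers=1) as serial_queue:
        for i in range(n_tasks):
            serial_queue.submit(make_task(i))
    # Leaving the `with` block waits for every queued task to finish.
    return log


if __name__ == "__main__":
    print(run_on_serial_queue(5))
```

Because the tasks run one at a time in the order they were queued, the shared list needs no locking, while the rest of the program is free to carry on concurrently.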
ArsTechnica always does a pretty thorough and reasonably technical review of each OSX release, and the latest one gives a pretty good explanation of GCD as well as Blocks.
http://arstechnica.com/apple/reviews/2009/08/mac-os-x-10-6.ars
The GCD stuff in particular starts on page 12, but the previous couple pages give a little bit of useful background on why it's important.
One time I threw a brick at a duck.
It's OpenCL that opens up the GPUs to general processing on 10.6. Although GCD certainly plays a role by dispatching threads to those resources.
You should check out this astounding OpenCL demo here: http://www.macresearch.org/opencl_episode1
Abstraction is the main theme in progression in programming. Blocks (Language change) and the rest of the package provide such an abstracted view.
GCD also does not need kernel changes, as written in the summary.
pthreads and fork/exec are the equivalent of assembly language for parallelism compared to GCD. The API makes it easy to create anonymous methods that can be parallelized, have dependencies, be put in serial or parallel queues, etc. Then the OS implementation can prioritize at a finely-grained level based on dynamic resource availability, relative process priority, etc., on a system-wide basis. (The OS implementation of GCD was already open-sourced as part of 10.6's Darwin xnu kernel release last week.)
It's pretty nifty stuff. And it's good to see Apple continue MacOS X's tradition of openness and support of open source.
E pluribus unum
My understanding of GCD is that it makes parallel programming and, importantly, interacting with the UI, pretty damned easy.
And despite that, if the only thing you can think of to do with Blocks is threading, then you seriously need to get back to learning your CompSci. Resource Acquisition Is Initialization becomes a load more practical, and that alone would mark a major shift between C code that's mostly resource and error management overhead and code that actually does something.
It looks like GCD is very similar to OpenMP. I am always biased toward using an open standard, when possible. Since many compiler vendors support OpenMP, why didn't apple just implement that for Objective-C, instead of creating their own threading solution? Judging from the examples, GCD looks much cleaner and simpler. But that often comes with a price.
It's an old tech. But it's different this time around.
Old thread pools are per process. This is a thread pool for the whole system. And that's new.
IOW, with GCD you do not need to configure how many threads every application should start. Applications do not need to bother with it anymore either: they simply queue batch tasks as they arrive and GCD guarantees that they will be executed, without overloading the system.
In short, GCD is a system-wide replacement for the old per-application thread pool configuration. It makes applications simpler and also doesn't force the end user to understand all the oddities of multiprogramming to get the most out of their boxes.
All hope abandon ye who enter here.
"Apple has open sourced libdispatch, also known as Grand Central Dispatch, which is technology in Snow Leopard that makes it easier for developers to take advantage of multi-core parallelism."
First line of THE SUMMARY.
I know it's not hip to RTFA, but it's at least a minimum requirement to read the very next line after the title, even while scrolling down eagerly to make a comment.
Why parallel programming has to be tied to a kernel change and to a language spec change, when a good library (OpenMP, anyone?, but I'm sure there are others) will suffice...
OpenMP requires "language changes" - it introduces new compiler keywords, and the compiler must support it, it's not "a library".
Get yourself a clue.
Well, I can see the logic of your concerns but Apple do actually seem to be fairly good at open sourcing infrastructure-related things. Sure they maintain a tight control on the user-facing stuff that makes Apple products distinctive - on the iPhone they even maintain a tight control on applications. But bear in mind they're working on an open source kernel, employ developers to work on the LLVM compiler (open source), open sourced an init-ish daemon (launchd) they developed, etc etc. On stuff that's "for geeks" they seem fairly enlightened wrt open source.
It's quite surprising from a company like Apple but the fact that they manage to make surprising decisions like that looks like a strong technical management team at work, to me.
Not everyone who reads slashdot is an OSX ween or has any idea what "Grand Central Dispatch" is. Perhaps a sentence or two describing why it is important/useful would save users from following the link, which doesn't provide that info either.
Look I'm as annoyed by poor summaries as anyone, but it seems almost reflexive to complain about them these days. The summary clearly said it, "makes it easier for developers to take advantage of multi-core parallelism". I don't care if you've never heard of OS X and know nothing about it. That sentence right there tells you what effect it has and why it's useful. After that it's just details as to where and if it is applicable to some other project. I guess what I'm saying is, I thought this was a pretty decent summary, enough to know if you should read about it.
Apple has a long history of open source contributions since OS X. Apple has released parts of OS X as Darwin under a BSD license since they first released OS X. OS X was developed from OPENSTEP which came from NextStep which itself was based on BSD. The kernel itself is derived from XNU which was based on the Mach kernel. All of these components are covered by BSD licenses. From what I understand since Apple uses a lot of open source Unix programs like CUPS, etc, they do contribute to fixes and patches on a regular basis.
Well, there's spam egg sausage and spam, that's not got much spam in it.
also known as Grand Central Dispatch, which is technology in Snow Leopard that makes it easier for developers to take advantage of multi-core parallelism.
They kinda' snuck that one in.
I drank what? -- Socrates
You don't think that libdispatch will be very genial to widespread usage, as it has a lot of OS-specific calls, which is an understandable position to take. But as an alternative you offer something whose "only caveat" is that it needs an entirely different compiler to build. A compiler whose most recent activity dates from two years ago.
... How is that a superior alternative?
GCD is designed for small, short-running blocks of code. Read the Ars article on Snow Leopard for examples. Naturally, it will also handle longer-running threads gracefully.
Whoever modded you informative is as ignorant as you.
Note that OpenMP is supported in Mac OS X, and has been for a while. It's just not as user-friendly, you have to think a lot more about variable scope and dependencies.
With Grand Central Dispatch, you basically only have to flip a boolean flag and it's running in parallel (in ObjC at least).
Apple actually owns CUPS now. But, yes, they still make it available under an OSS license.
Shoot, I already noticed the difference on my 2.5 yr old Mac Pro (1.1). First boot on 10.6 and I was like "wow, feels like a new machine again". All of the bundled apps have been recompiled (64 bit) and cleaned up (and apparently take advantage of GCD everywhere possible). I really didn't think I would see that much of a difference with 10.6 and really only upgraded because I could for $29 (I mean at that price, why not right?) I am very happy with my $29 purchase thus far. I've only had to work through a couple app incompatibilities (and as I have been able to work around them just fine, I am happy.) This is of course just my experience thus far with 10.6. I have no hard benchmark numbers for you. But I noticed right away the smoothness it brought to my older Mac Pro. And it was an easier upgrade than going from 10.4 --> 10.5.
If they were violating the GPL they probably wouldn't have released it under the Apache 2.0 license..
You are posting in a thread about the fact that Apple made their implementation open source and you are claiming vendor lock-in?
Are you one of those rabid Apple-haters we see so often around here? Or are you just amazingly stupid?
No, GCD is purely a "C with block extensions" API. The blocks are essentially anonymous methods you can pass directly in to functions. It's integrated at a much lower level than Objective-C, which is only used for the higher-level application framework in MacOS.
Apple open-sourced the GCD API with this announcement, block extensions to C with LLVM implementation last week, and the OS support necessary as part of the xnu kernel Darwin release for 10.6.
Here is the list of all the open source packages apple uses. These include the kernel and CUPS (under the 10.6 tab), as well as their own modified version of other open source packages like java or gcc (under the developer tab). Contrary to a lot of the common "apple is teh proprietary satan!!1" posts on slashdot, apple acts just like you might expect a more or less decent proprietary unix distributor to act: they open source what they can but keep closed whatever they feel is necessary to maintain a competitive advantage against Microsoft or that would infringe on hardware sales.
P.S. As an interesting tidbit, you'll notice that clamAV is posted there as well. Hmm, makes you wonder.
What if you're running two applications that both are capable of monopolizing all your cpu time? How will your app know that it's only going to get 50% of the available cpu time from the OS, so it should only start threads for half the cpus?
GCD decides how many threads a collection of tasks should be split across. If an app running on an 8-core machine wants to run 100 tasks, then they could be spread across anywhere from 1 to 8 threads, depending on what else is running. Since it's the OS that knows what else is running, it can make more intelligent decisions about how many threads should be running.
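A rough sketch of that "queue the tasks, let something else pick the thread count" model follows. It uses Python's standard thread pool, which is per-process, so it only illustrates the programming model and not the system-wide balancing described above; `run_tasks` is a hypothetical helper.

```python
# Sketch: the caller queues 100 tasks and never chooses a thread per task.
# A per-process stdlib pool stands in for GCD's system-managed one.
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def run_tasks(n_tasks, max_workers=None):
    """Run n_tasks small jobs; return (results, number of threads used)."""
    workers = max_workers or os.cpu_count() or 1
    threads_seen = set()
    lock = threading.Lock()

    def job(i):
        with lock:  # protects the bookkeeping set only
            threads_seen.add(threading.get_ident())
        return i * i

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(job, range(n_tasks)))  # order is preserved
    return results, len(threads_seen)


if __name__ == "__main__":
    results, used = run_tasks(100)
    print(len(results), "tasks ran on", used, "thread(s)")
```

The number of threads actually used can be anywhere from one up to the worker limit, and the calling code is the same either way.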
Grand central dispatch has many innovations, but the key feature it provides is that thread pooling is now handled by the OS, not the program. This means that in a dynamic environment you don't have applications stepping on each other when they ask, in total, for more threads than the multi-core system can optimally handle. So if Mail asks for fifty threads and Firefox asks for fifty threads and the CPU you are running on can realistically only handle 10 threads then GCD figures out how to manage things so you don't get a spinning beachball.
It turns out a lot of tricks were required to do this, including things like just-in-time compiling with LLVM and this C blocks stuff, but that's way over my head.
Some drink at the fountain of knowledge. Others just gargle.
SMP = Symmetric Multi-Processing. GCD, in theory, will also allow access to Asymmetric Multi-Processing as well since you can take advantage of GPU resources and cores as well.
I've come to really like GCD; I haven't played with it much in Cocoa (Obj-C) but I've been moving some of the stuff I wrote a long time ago in C to use it and I think I can say that what it does is *really* *really* awesome. It helps when writing code to be run in parallel; it does not help you in determining *what* should be done in parallel. By putting your work into queues, by way of closures (yeah, blocks, whatever...I'm sticking with the closure name), it's up to the underlying OS to determine what thread gets what work, and on what processor. Having worked with multithreaded stuff on Windows, and calling GetThreadAffinityMask or whatever it was, and being told that it's just a *hint* to the OS, which is free to ignore you, which it always did, GCD really does spread out the work evenly among my 16-proc MacPro, and then turns around and does it just as well on the dual-core mini.
I've wanted something like this for years; a really decent OS thread scheduler that divides up the work on the other processors in a sensible fashion. I was even looking into how much effort it would take to write something like this from scratch for Linux, and now I don't even have to. Sweet!
Caveats: This is in OS X only, so no iPhone GCD (at least, not yet...not really necessary until we have multi-core iPhones), and while I've lived with additions to C++ through the years (templates mostly), the idea of adding, well, anything to C seems strange, let alone something as run-time dependent as closures.
Don't forget they've contributed a lot to WebKit, which now seems to be used by every new browser that comes out.
So you're saying gcc is being used for vendor lock in?
It's quite surprising from a company like Apple...
Companies are defined by what they do. If a person is surprised that Apple is releasing technologies as open source projects, that just means that person has an inaccurate image of what kind of company Apple is and should pay more attention to what Apple does and less to espoused, unsupported opinions from astroturfers and zealots.
Good support for OpenMP or any of the existing shared memory parallel programming libraries would have been much cleaner and portable.
This seems to be one of the common complaints in this discussion ("why not contribute to an existing project instead?") and others involving Apple's open-source participation, and certainly it has merit. But open-source developers write their own more-or-less redundant tools all the time with no better justification than "I didn't like how tool X handled the specific case Y, so I wrote my own tool from scratch" - SourceForge is littered with them. So (I'm responding generally, not to the parent) I don't see why Apple seems to be given an inordinate amount of grief over this - they didn't like how the existing stuff worked, so they wrote their own. Same thing they've done with launchd for that matter. Or moving away (somewhat) from gcc.
Also, having written their own tool, they released it under a license they preferred. To borrow a phrase frequently used in defense of the GPL - if you don't like the license, you're free to not use the tool and/or go write your own.
#DeleteChrome
That's great and all, but systems have been doing this for years. When I launch a thread on Linux I don't care where it ends up. The scheduler takes care of it. Same with Perl, pthreads, OpenMP and pretty much every other threading technology I've ever used.
What's new here?
With GCD, you don't "launch a thread". You "start a task", and how it is scheduled in a thread pool is up to the library. You don't muck around with locks, either - you define dependencies between tasks, and they're scheduled accordingly.
Why don't you just look at some examples of GCD use, and see for yourself? It really is much clearer to see the code in this case.
Microsoft has a very similar thing, by the way - Parallel Patterns Library - except that one is for C++ only, and uses C++0x lambdas rather than a (currently) proprietary C extension. But central ideas, and use patterns, are very similar.
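For readers who want to see the "start tasks and declare what depends on what" style without a Mac handy, here is a loose sketch. It uses Python futures rather than GCD or the Parallel Patterns Library, and `count_words` and `word_count_pipeline` are hypothetical names.

```python
# Sketch: independent tasks plus one task that depends on all of them.
# Python futures stand in for GCD tasks; this is an analogy only.
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Independent unit of work: count the words in one chunk of text."""
    return len(chunk.split())


def word_count_pipeline(chunks):
    """Count words per chunk in parallel, then sum once every count is ready."""
    with ThreadPoolExecutor() as pool:
        # One task per chunk; which thread runs which task is the pool's call.
        partials = [pool.submit(count_words, c) for c in chunks]

        def combine():
            # The dependency is expressed by waiting on the futures,
            # not by locks or shared counters.
            return sum(f.result() for f in partials)

        # Submitted last, so every partial task has already been handed
        # to a worker by the time this one starts.
        total = pool.submit(combine)
        return total.result()


if __name__ == "__main__":
    print(word_count_pipeline(["a b c", "d e", ""]))
```

Nothing in the calling code creates a thread or takes a lock; it only says what work exists and which result needs which others.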
I'm honestly surprised how ignorant and lazy the regular slashdotter has become with the years.
Any self-respecting geek should already be keeping up to date with Apple advancements which are and will be impacting technology in the years to come.
If you people haven't noticed already, Apple has been consistently releasing libraries and server software as open source projects for the rest to pick up , use and modify, with liberal licenses.
A friend of mine used to say (can't remember exactly... paraphrasing:)
* Microsoft wants all software to be theirs
* GNU wants all software to be free
* BSD wants all software to be better
And releasing GCD, gentlemen, is another master stroke by Apple, just like WebKit, Bonjour, LLVM, the list goes on, to share knowledge and advance technology by merit, not by forcing it down your throat thanks to the monopoly you have been handed.
The term "block" is familiar to Ruby programmers. It's an old concept which Ruby has made easy to use and hence popular and actually useful.
And here's another lesson which OpenBSD, Apple and Ruby have been putting to work without you guys noticing: any technology that is difficult to use, no matter how good it is, will not be used if it gets in your way; the technology must be easy to deploy/use and unobtrusive to be actually used and useful.
Just remember SELinux and how many people just disable it, no matter how good it is (which I don't think it is, but that's for another rant). Then compare it with the technology that OpenBSD has been implementing for memory protection, which is unobtrusive and ready to use with no extra configuration. Same with Ruby blocks, which more programmers are using and which a lot of software is benefiting from now, even though higher order functions and closures have been around for ages.
Having Ruby-like blocks in C and Objective-C is so COOL, you must appreciate that if you think you're serious at programming. Apple has already submitted it to be a standard. I believe MacRuby will benefit from this too, which is Ruby written in Objective-C, which implements Ruby classes as Objective-C classes, achieving incredible speed, taking advantage of Objective-C and LLVM technologies.
Now, I want my late '90s Slashdot back please, where you could more easily find insightful and informative comments. There's a lot of garbage and Microsoft apologists nowadays.
The best way to predict the future is to invent it
This is all running on kernel 2.6.31, right?
What sort of explanation would you like?
A general overview for the layman?
A more thorough overview for the OS enthusiast?
A detailed overview from a developer's perspective?
That's exactly my point. With cooperative multitasking, if you know everything that's going on, it can be more efficient than preemptive multitasking. Just like manually managing your threads can be more efficient than something like GCD.
At the most fundamental level, cooperative is more efficient (if well programmed), just as manually managing threads (if well programmed) is more efficient.
It's that "if well programmed" that's the killer. If your environment is completely known, *and* you are skilled, you have the potential to do a better job by hand. But in the real world, where your software is going to run on all sorts of computers with all sorts of different software and processing capabilities, you can't fine tune your program, so letting the system handle it works better.
Looking at it differently, with cooperative multitasking or manually controlling threads, *if* you have complete knowledge of the system *and* you are sufficiently skilled, you can approach 100% efficiency. With preemptive multitasking, or thread management similar to GCD, you might reach only 90% efficiency. But if you have to build cooperative multitasking or manually managed threads into a program that is going to run on all sorts of computers, you may only manage perhaps an overall average of 70% efficiency.
And given that you can get that 90% efficiency with much less effort than you have to put in to get maybe around 70% (or whatever) efficiency, the benefits here are obvious.
Blocks have some important properties that distinguish them from functions: they capture enclosing scope and can live beyond the lifetime of that enclosing scope. If the event handler that you want to put into a queue is an ordinary function pointer that has not captured enclosing state, you may do so. A type is provided: dispatch_function_t. Contrast this with dispatch_block_t while you're reading the API documentation. The very first paragraph under the heading "About Dispatch Queues" may help.
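The contrast between the two styles can be sketched as follows. In libdispatch, `dispatch_function_t` is a plain function pointer taking a single `void *` context, while `dispatch_block_t` is a block taking no arguments. The Python below only imitates that shape: `ToyQueue`, `async_f`, `async_block`, `drain` and `add_to_total` are all hypothetical names, and the queue is single-threaded.

```python
# Sketch: "function + explicit context" versus "closure with captured state".
# ToyQueue is a made-up, single-threaded stand-in for a dispatch queue.


class ToyQueue:
    """Collects work items and runs them in FIFO order when drained."""

    def __init__(self):
        self._pending = []

    def async_f(self, context, function):
        # Function-pointer style: state travels in an explicit context argument.
        self._pending.append(lambda: function(context))

    def async_block(self, block):
        # Block style: the callable already carries whatever it captured.
        self._pending.append(block)

    def drain(self):
        # Run everything queued so far, oldest first.
        while self._pending:
            work = self._pending.pop(0)
            work()


def add_to_total(context):
    """Ordinary function: everything it needs arrives through `context`."""
    context["total"] += context["amount"]


if __name__ == "__main__":
    queue = ToyQueue()
    ctx = {"total": 0, "amount": 5}
    queue.async_f(ctx, add_to_total)

    seen = []
    amount = 7
    queue.async_block(lambda: seen.append(amount))  # captures `amount`, `seen`

    queue.drain()
    print(ctx["total"], seen)
```

With the function style the caller has to package state into a context and keep it alive; with the closure style the captured variables come along automatically.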
The "cue the foo posts in 3, 2, 1..." posts will commence with no subsequent foo posts in 3, 2, 1...