Apple Open Sources Grand Central Dispatch
bonch writes "Apple has open sourced libdispatch, also known as Grand Central Dispatch, which is technology in Snow Leopard that makes it easier for developers to take advantage of multi-core parallelism. Kernel support is not required, but performance optimizations Apple made for supporting GCD are visible in xnu. Block support in C is required and is currently available in LLVM (note that Apple has submitted their implementation of C blocks for standardization)." Update: 09/11 15:32 GMT by KD : Drew McCormack has a post up speculating on what Apple's move means to Linux and other communities (but probably not Microsoft): "...this is also very interesting for scientific developers. It may be possible to parallelize code in the not too distant future using Grand Central Dispatch, and run that code not only on Macs, but also on clusters and supercomputers."
Everyone who reads slashdot isn't an OSX ween and has no idea what "Grand Central Dispatch" is. Perhaps a sentence or two describing why it is important/useful would save users from following the link which doesn't provide that info either.
I want those 5 seconds back! :)
Can somebody explain what this "blocks" is? I mean, C being a block-structured language, I thought it already supported them...
I'm not too well versed in Cocoa development. I pushed some code that should have been in a separate thread into GCD, which requires you to use a block. All in all, I had to add an include, 1 line of code and a closing bracket.
Apple has made some seriously cool stuff here.
the question is:
What license? Apache v2.0
What the fuck is GCD?
Grand Central Dispatch (GCD), named for Grand Central Terminal, is used to optimize application support for multicore processors. It is an implementation of task parallelism based on the thread pool pattern.
GCD works by allowing specific tasks in a program that can be run in parallel to be demarcated as blocks.[2] To this end, it extends the syntax of C, C++, and Objective-C programming languages.[2] At runtime, the blocks are queued up for execution and depending on availability of processing resources, they are scheduled to execute on any of the available processor cores[2] (referred to as "routing" by Apple).[3]
see also
# Task Parallel Library - comparable technology in the .NET Framework developed by Microsoft.
# Java Concurrency - comparable technology in Java (also known as JSR 166).
IranAir Flight 655 never forget!
It's a library for task parallelism using a thread pool, introduced in Mac OS X 10.6 (Snow Leopard). Wikipedia tells all.
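For anyone who hasn't met the thread pool pattern, here is a minimal sketch of the idea in Python. It is only an analogy using the standard library's `concurrent.futures`, not GCD's API; the `word_count` task and the sample strings are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def word_count(text):
    """A small, independent unit of work (hypothetical example task)."""
    return len(text.split())

documents = [
    "grand central dispatch",
    "blocks add closures to C",
    "snow leopard",
]

# The pool owns the threads; the caller only hands over tasks and
# collects results. Which thread runs which task is the pool's business.
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(word_count, doc) for doc in documents]
    counts = [f.result() for f in futures]

print(counts)
```

The point is the division of labour: the program says what can run independently, and the pool decides where it runs.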
"We recognize that libdispatch is a new technology and you likely have many questions. Here are some documentation resources for getting started:
Introducing Blocks and Grand Central Dispatch
Concurrency Programming Guide
Grand Central Dispatch (GCD) Reference"
Blocks:
In Snow Leopard, Apple has introduced a C language extension called "blocks." Blocks add closures and anonymous functions to C and the C-derived languages C++, Objective-C, and Objective C++.
Perhaps the simplest way to explain blocks is that they make functions another form of data. C-derived languages already have function pointers, which can be passed around like data, but these can only point to functions created at compile time. The only way to influence the behavior of such a function is by passing different arguments to the function or by setting global variables which are then accessed from within the function. Both of these approaches have big disadvantages
Full Read: http://arstechnica.com/apple/reviews/2009/08/mac-os-x-10-6.ars/10
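To make the "functions as data" point concrete, here is a rough sketch in Python rather than C (Python already has closures, so no compiler extension is needed). It contrasts a closure that captures its own state with the function-plus-explicit-context style that plain function pointers force on you. All names here are hypothetical.

```python
def make_scaler(factor):
    """Return a new function that remembers `factor` (a closure)."""
    def scale(x):
        return x * factor
    return scale

# Two functions created at run time, each carrying different captured state.
double = make_scaler(2)
triple = make_scaler(3)

def scale_with_context(context, x):
    """Function-pointer style: behaviour can only vary through arguments."""
    return x * context["factor"]

ctx = {"factor": 2}

results_closure = [double(v) for v in (1, 2, 3)]
results_context = [scale_with_context(ctx, v) for v in (1, 2, 3)]

print(results_closure, results_context, triple(5))
```

Both styles compute the same thing; the closure just keeps the extra state out of every call site.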
Directly in line with blocks is Grand Central Dispatch (and this is where blocks become really useful):
GCD is a technology to resolve the concurrency conundrum by giving programmers a very easy way to split tasks into multiple sub-tasks which can then be loaded onto different threads/CPUs. All this also works with normal threading, but GCD makes the process far easier, with the intention of preparing OS X for future multicore machines:
http://arstechnica.com/apple/reviews/2009/08/mac-os-x-10-6.ars/12
It does so by using blocks as separate tasks:
http://arstechnica.com/apple/reviews/2009/08/mac-os-x-10-6.ars/13
"When I first heard about Grand Central Dispatch, I was extremely skeptical. The greatest minds in computer science have been working for decades on the problem of how best to extract parallelism from computing workloads. Now here was Apple apparently promising to solve this problem. Ridiculous.
But Grand Central Dispatch doesn't actually address this issue at all. It offers no help whatsoever in deciding how to split your work up into independently executable tasks -- that is, deciding what pieces can or should be executed asynchronously or in parallel. That's still entirely up to the developer (and still a tough problem). What GCD does instead is much more pragmatic. Once a developer has identified something that can be split off into a separate task, GCD makes it as easy and non-invasive as possible to actually do so.
The use of FIFO queues, and especially the existence of serialized queues, seems counter to the spirit of ubiquitous concurrency. But we've seen where the Platonic ideal of multithreading leads, and it's not a pleasant place for developers.
One of Apple's slogans for Grand Central Dispatch is "islands of serialization in a sea of concurrency." That does a great job of capturing the practical reality of adding more concurrency to run-of-the-mill desktop applications. Those islands are what isolate developers from the thorny problems of simultaneous data access, deadlock, and other pitfalls of multithreading. Developers are encouraged to identify functions of their applications that would be better executed off the main thread, even if they're made up of several sequential or otherwise partially interdependent tasks. GCD makes it easy to break off the entire unit of work while maintaining the existing order and dependencies between subtasks." (source = above url)
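The "islands of serialization" idea can be sketched in Python, again as an analogy and not GCD's API. The assumption here is that a one-worker pool stands in for a serial queue and a multi-worker pool for a concurrent one; `expensive` and `record` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

log = []  # shared state; only ever touched from the serial queue

def expensive(n):
    """Stand-in for work that is safe to run in parallel."""
    return n * n

def record(value):
    """Runs only on the single-worker queue, so no lock is needed."""
    log.append(value)

concurrent_pool = ThreadPoolExecutor(max_workers=4)  # the "sea"
serial_queue = ThreadPoolExecutor(max_workers=1)     # the "island"

def work(n):
    result = expensive(n)
    # Hand the shared-state update to the serial queue and move on.
    serial_queue.submit(record, result)

for n in range(10):
    concurrent_pool.submit(work, n)

# Drain the concurrent work first, then the serial queue it fed.
concurrent_pool.shutdown(wait=True)
serial_queue.shutdown(wait=True)

print(sorted(log))
```

The order of entries in `log` may vary between runs, but every update happens one at a time on the same queue, which is what keeps the shared list consistent.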
Apple has a long tradition of doing things its own way, not always in the cleanest way under the hood. This does not always help, and sometimes it just adds to the confusion. Why does parallel programming have to be tied to a kernel change and to a language spec change, when a good library (OpenMP, anyone? but I'm sure there are others) would suffice? Good support for OpenMP or any of the existing shared memory parallel programming libraries would have been much cleaner and more portable.
Disclaimer: I use/program/work with an Intel Macbook Pro.
It looks like GCD is very similar to OpenMP. I am always biased toward using an open standard, when possible. Since many compiler vendors support OpenMP, why didn't apple just implement that for Objective-C, instead of creating their own threading solution? Judging from the examples, GCD looks much cleaner and simpler. But that often comes with a price.
Well, I can see the logic of your concerns but Apple do actually seem to be fairly good at open sourcing infrastructure-related things. Sure they maintain a tight control on the user-facing stuff that makes Apple products distinctive - on the iPhone they even maintain a tight control on applications. But bear in mind they're working on an open source kernel, employ developers to work on the LLVM compiler (open source), open sourced an init-ish daemon (launchd) they developed, etc etc. On stuff that's "for geeks" they seem fairly enlightened wrt open source.
It's quite surprising from a company like Apple but the fact that they manage to make surprising decisions like that looks like a strong technical management team at work, to me.
Why is it so remarkable to you that Apple open sources one of their technologies? They already have several open source projects which they use in their software, such as Darwin and WebKit. They also use lots of 3rd party open source technologies in their software. I don't think they're doing all of this because they're somehow violating licenses, but because they just think open source isn't so bad.
Pretty good is actually pretty bad.
Apple has a long history of open source contributions since OS X. Apple has released parts of OS X as Darwin under a BSD license since they first released OS X. OS X was developed from OPENSTEP which came from NextStep which itself was based on BSD. The kernel itself is XNU, which is based on the Mach kernel. All of these components are covered by BSD licenses. From what I understand, since Apple uses a lot of open source Unix programs like CUPS, etc, they do contribute fixes and patches on a regular basis.
Well, there's spam egg sausage and spam, that's not got much spam in it.
no need for kernel support
open source implementation
yeah. sounds a helluvalot like vendor lock-in to me
You don't think that libdispatch will be very genial to widespread usage, as it has a lot of OS-specific calls, which is an understandable position to take. But as an alternative you offer something whose "only caveat" is that it needs an entirely different compiler to build. A compiler whose most recent activity dates from two years ago.
... How is that a superior alternative?
GCD is designed for small, short-running blocks of code. Read the Ars article on Snow Leopard for examples. Naturally, it will also handle longer-running threads gracefully.
Whoever modded you informative is as ignorant as you.
The usual way: anything that I know how to use is obviously far superior to everything that you know how to use.
If you were blocking sigs, you wouldn't have to read this.
Locked into open source software? I can think of worse things.
I wouldn't be surprised if a lot of the way it was designed is to work with Objective-C, due to it being OS X's primary language.
Apple actually owns CUPS now. But, yes, they still make it available under an OSS license.
Big parts of Mac OS X are open source, including the kernel, the compiler and some basic libraries. Now including LLVM, Clang and GCD.
Absolutely true, but I want to point out that Apple's just trying to make simple things easy -- they're not trying to say that they've finally solved truly capital-h Hard concurrency problems.
lorem ipsum, dolor sit amet
There is no tech that can help you with that. Maybe Apple should solve the solvable problems first (making parallelizing easy) before tackling the unsolvable ones?
What you meant to say was, "Everyone who reads Slashdot isn't an internet weeny, and has no idea how to find out what a technology of which they haven't heard before might be, other than imploring (counterproductively with a snide remark) to be spoon fed by other participants in the discussion." The answer is the wonderful invention, Google. You can type a word or phrase, like "Grand Central Dispatch", and Google will present you with links to documents which discuss it. Oftentimes, one of those documents might appear at the web site of an online community encyclopedia, known as Wikipedia, which not only describes the thing in question, but also frequently provides convenient links to additional information.
If you mod me down, I shall become more powerful than you could possibly imagine.
You and I must not be reading the same code.
libdispatch constantly references mach throughout, which is unique to OS X and would take ages to #ifdef out completely. It also makes use of features only available in GCC 4.2.x or later, which means that many older platforms are out in the cold. It also makes a ton of platform assumptions (though luckily many of these are #define'd, so they can be redefined for other platforms later, however there's plenty that isn't, and is coded specifically for Intel platform performance). Furthermore, it codes in some really weird Apple-specific things (like label names for the priority queue levels, #ifdef's for Cocoa support, other very strange code, not clean in anyone's definition).
With some significant work by an interested party, it could be made portable, but Apple didn't Open Source it with the idea that it'd immediately build on non-x86, Linux, or Windows (and in fact, it might be ages before it builds on the latter, using pthreads and functions like sysctlbyname(3) throughout). It could be portable... some day. But as it stands it's not. At all.
Shoot, I already noticed the difference on my 2.5 yr old Mac Pro (1.1). First boot on 10.6 and I was like "wow, feels like a new machine again". All of the bundled apps have been recompiled (64 bit) and cleaned up (and apparently take advantage of GCD everywhere possible). I really didn't think I would see that much of a difference with 10.6 and really only upgraded because I could for $29 (I mean at that price, why not right?) I am very happy with my $29 purchase thus far. I've only had to work through a couple app incompatibilities (and as I have been able to work around them just fine, I am happy.) This is of course just my experience thus far with 10.6. I have no hard benchmark numbers for you. But I noticed right away the smoothness it brought to my older Mac Pro. And it was an easier upgrade than going from 10.4 --> 10.5.
If they were violating the GPL they probably wouldn't have released it under the Apache 2.0 license..
You are posting in a thread about the fact that Apple made their implementation open source and you are claiming vendor lock-in?
Are you one of those rabid Apple-haters we see so often around here? Or are you just amazingly stupid?
No, GCD is purely a "C with block extensions" API. The blocks are essentially anonymous functions you can pass directly in to functions. It's integrated at a much lower level than Objective-C, which is only used for the higher-level application framework in Mac OS.
Apple open-sourced the GCD API with this announcement, block extensions to C with LLVM implementation last week, and the OS support necessary as part of the xnu kernel Darwin release for 10.6.
E pluribus unum
I'd say adding anonymous functions to C is more than just 'big sounding'.
I know the C-like language geniuses won't jump on Erlang immediately, but the multi-core support is awesome. I'm pretty sure there's a port for every platform too.
What I really want to see next is libxgrid so that I can use my debian/windows boxes as Xgrid nodes.
My university had around 20 computer labs; some were packed, but some were completely empty. I understand that it'd use more energy, but you could turn those into a cheap 'super computer' and loan/rent out time to groups on campus.
If functions can now be data passed around as variables, doesn't this reduce the security of an application? Granted, you don't need to use blocks but I can envision a future world where internet server software is written heavily in blocks. What if a block makes its way into the system from the outside world? It's the same thing as a code injection hack but this technology could potentially make it much simpler to do that. Yet another security concern.
Given the new-ness of the technology it seems like some of these security issues need to be worked out... patterns developed and such.
This technology strikes me as a solution for user experience issues and not bullet proofed for server solutions. I may be wrong.
Here is the list of all the open source packages Apple uses. These include the kernel and CUPS (under the 10.6 tab), as well as their own modified versions of other open source packages like java or gcc (under the developer tab). Contrary to a lot of the common "apple is teh proprietary satan!!1" posts on Slashdot, Apple acts just like you might expect a more or less decent proprietary unix distributor to act: they open source what they can but keep closed whatever they feel is necessary to maintain a competitive advantage against Microsoft or that would infringe on hardware sales.
P.S. As an interesting tidbit, you'll notice that clamAV is posted there as well. Hmm, makes you wonder.
Gentlemen! You can't fight in here, this is the war room!
Not the whole problem, but actually, yes, some of the problem is with languages.
And some problems are not hard to multi-thread, and yet reasonably competent programmers end up sometimes screwing it up anyway. They make mistakes not because they're stupid, but because (for example) they're in a hurry. (All professional programmers are in a hurry. Ship it! Next work order!!)
There are ways we can make that situation easier.
Ah, the old "power is dangerous" argument. You're right, and you're wrong, at the same time. Or rather, you are correct and irrelevant. Foot-shooters' problems are their problems. Your problems are your problems, and you don't want to have them. It's ok to make things easier for you. Don't worry about That Other Guy fucking up.
"Believe me!" -- Donald Trump
Erlang is pretty cool, but I doubt the linux kernel or any other massive c codebase is going to be rewritten in it. GCD is an incremental improvement, that will be more likely to have a bigger impact.
Well.. maybe. Or Maybe not. But Definitely not sort of.
What if you're running two applications that both are capable of monopolizing all your CPU time? How will your app know that it's only going to get 50% of the available CPU time from the OS, so it should only start threads for half the CPUs?
GCD decides how many threads a collection of tasks should be split across. If an app running on an 8-core machine wants to run 100 tasks, then they could be spread across anywhere from 1 to 8 threads, depending on what else is running. Since it's the OS that knows what else is running, it can make more intelligent decisions about how many threads should be running.
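A rough Python sketch of "100 tasks, at most 8 threads", with one loud caveat: `ThreadPoolExecutor` uses a fixed cap chosen by the program, whereas the comment above describes GCD picking the thread count based on what else is running. That adaptive part is not reproduced here. `MAX_THREADS` and `task` are hypothetical names.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor

# Assumption: cap the pool at 8 threads, or fewer on smaller machines.
MAX_THREADS = min(8, os.cpu_count() or 1)

def task(i):
    """Report which thread ended up running this task."""
    return threading.get_ident()

with ThreadPoolExecutor(max_workers=MAX_THREADS) as pool:
    thread_ids = set(pool.map(task, range(100)))

# 100 tasks went in; somewhere between 1 and MAX_THREADS threads ran them.
print(len(thread_ids), "threads used for 100 tasks")
```

The program never asks for 100 threads; it describes 100 tasks and lets the pool reuse a small number of workers.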
Grand Central Dispatch has many innovations, but the key feature it provides is that thread pooling is now handled by the OS, not the program. This means that in a dynamic environment you don't have applications stepping on each other when they ask for more threads, all told, than the multi-core system can optimally handle. So if Mail asks for fifty threads and Firefox asks for fifty threads and the CPU you are running on can realistically only handle 10 threads, then GCD figures out how to manage things so you don't get a spinning beachball.
It turns out a lot of tricks were required to do this, including things like just-in-time compiling with LLVM and this C blocks stuff, but that's way over my head.
Some drink at the fountain of knowledge. Others just gargle.
SMP = Symmetric Multi-Processing. GCD, in theory, will also allow access to Asymmetric Multi-Processing as well since you can take advantage of GPU resources and cores as well.
As a GPL advocate, the GP can't fathom why a company like Apple would willingly release code under a free license so that the community can benefit. He has been living by his apparently false belief that we need the over-complicated GPL (along with its inherent incompatibilities with other freer licenses) to force companies to give back. He's been conditioned to believe that the GPL has high value and utility in keeping code free, when in reality companies do give to non-GPL licensed free software projects like Apache and PostgreSQL without being coerced.
This author takes full ownership and responsibility for the unpopular opinions outlined above.
I've come to really like GCD; I haven't played with it much in Cocoa (Obj-C) but I've been moving some of the stuff I wrote a long time ago in C to use it and I think I can say that what it does is *really* *really* awesome. It helps when writing code to be run in parallel; it does not help you in determining *what* should be done in parallel. By putting your work into queues, by way of closures (yeah, blocks, whatever...I'm sticking with the closure name), it's up to the underlying OS to determine what thread gets what work, and on what processor. Having worked with multithreaded stuff on Windows, and calling GetThreadAffinityMask or whatever it was, and being told that it's just a *hint* to the OS, which is free to ignore you, which it always did, GCD really does spread out the work evenly among my 16-proc Mac Pro, and then turns around and does it just as well on the dual-core mini.
I've wanted something like this for years; a really decent OS thread scheduler that divides up the work on the other processors in a sensible fashion. I was even looking into how much effort it would take to write something like this from scratch for Linux, and now I don't even have to. Sweet!
Caveats: This is in OS X only, so no iPhone GCD (at least, not yet...not really necessary until we have multi-core iPhones), and while I've lived with additions to C++ through the years (templates mostly), the idea of adding, well, anything to C seems strange, let alone something as run-time dependent as closures.
Don't forget they've contributed a lot to WebKit, which now seems to be used by every new browser that comes out.
So you're saying gcc is being used for vendor lock in?
I agree. Furthermore, these "high level language" things are just dangerous toys that let overconfident children shoot themselves in the feet more quickly. It should be the law that if you want to program a computer you have to use a real language - Assembly.
multi-core != SMP. see: PS3.
As an interesting tidbit, you'll notice that clamAV is posted there as well. Hmm, makes you wonder.
It's used in OS X server, where it integrates with the mail service in order to filter/block emails containing known viruses.
GCD doesn't support using GPUs; that's what OpenCL is for. That said, there were some nice demos at WWDC where, for example, a solar-system modelling tool (tracking the gravitational movements of a zillion objects in real-time) was rewritten by adding first GCD then OpenCL. Using GCD to offload calculations to other threads in parallel made quite a difference, then OpenCL just blew the lid off. It was SCARY just how much difference it made. And the nice part was that GCD gave a nice performance boost by adding a couple of lines here & there to wrap little bits of long-running code in calls to dispatch_async().
It's quite surprising from a company like Apple...
Companies are defined by what they do. If a person is surprised that Apple is releasing technologies as open source projects, that just means that person has an inaccurate image of what kind of company Apple is and should pay more attention to what Apple does and less to espoused, unsupported opinions from astroturfers and zealots.
I don't mind progress, and new standards and all that, and the idea of a "user-friendly scheduler" is really nice, but how hard would it be to make this work with just generic callable objects? it's not that hard to implement a closure in C, and it's been done for years for things like boost and libsigc++ (any signal/callback system that doesn't have upvalues is useless to me at this point). And it's not like these "blocks" are actually compiled and linked at run time, it's just a pointer to a static function with a bunch of extra data on it. If the API just took a callable object (which could be implemented as a "block" if the functionality was there), there'd be no need to wait for some committee to get together and approve an addition to a standard.
--
Stay tuned for some shock and awe coming right up after this messages!
The vendor's (Apple's) compiler? You realize that you're talking about gcc, right?
Please tell me I was not the only one who was expecting a railway signalling system....
You are posting in a thread about the fact that Apple made their implementation open source and you are claiming vendor lock-in?
Are you one of those rabid Apple-haters we see so often around here? Or are you just amazingly stupid?
I must be amazingly stupid because I rather like Apple products.
Proprietary extensions are done for (arguably) the same reason by Microsoft; the goal should instead be to work on better iterations of language standards (C/C++) and not on introducing arbitrary language extensions that are not portable across compilers - especially not really extremely awkward ones like 'anonymous function pointers.' There's a similar argument to be made for 'encouraging' developers to use C# and Objective C.
The compiler (LLVM) is open source. Apple is a big contributor to the project, but it's not their compiler.
LLVM will be the "standards based alternative". GCC is ugly and old, the whole point of LLVM is to replace it - generally, not just for Apple.
Companies are indeed defined by what they do - the majority of what Apple does in the marketplace is provide tightly integrated "Just Works" experiences by maintaining a strong control over what they do. They also maintain intense secrecy on all their products. Do you dispute anything specific that I said?
At the end of the day *nothing* is really unexpected on some level since there are unseen laws and logic governing basically everything that happens. But if you see a trend in behaviour from somewhere (e.g. tight control and secrecy in most of Apple's most visible things) and then that trend is apparently violated, that qualifies as "surprising" in my opinion.
Good point, that's actually a biggy and I shouldn't have missed that.
At one point one of the KDE devs pointed out that, contrary to most folk's expectations, interaction between KHTML wasn't happening because Apple were - at the time - complying with the letter of the license but not really providing patches in a useful form. The KDE guy was basically saying "No, really, we're still on our own with KHTML" to uninformed commenters, rather than necessarily saying Apple was being bad.
After that, Apple addressed these concerns and massively improved (AFAIK) their community interaction over WebKit. Pretty impressive in my view. Openness and responding effectively to criticism is not the kind of thing Apple is usually known for - but unlike some companies they do often seem to understand that OSS works better as a two way street, rather than just going "OMG free developers!".
I must confess after reading comments here and the Wikipedia article, I'm not sure exactly what's novel here.
The "blocks" concept is nothing more than a repackaged version of futures as far as I can tell. C++0x has anonymous functions, and futures libraries for C++ already exist. I can see value in adding this for C but I sure hope Apple isn't thinking of trying to introduce a competing proposal for C++. We're way beyond that stage.
And I really don't get the connection to LLVM and jitting. What's the advantage?
As for the programming model, futures or OpenMP seem a much better way of expressing these kinds of things and they exist today. Better yet would be automatic parallelism extracted by the compiler when possible. It's limited to loops mainly, but anything to relieve the programmer from specifying this stuff is a win.
As for the scheduler, it seems Apple has simply decided that kernel threads are more flexible than user threads. That may be true, but this is an old argument. I just don't see the novelty.
Can someone enlighten me? An efficient implementation of threads for MacOS is certainly important so I'm not denying that the technology is important. But I don't understand the hype about it being somehow revolutionary.
It also adds anonymous blocks to C, which is pretty radical. Read the Ars review on Snow Leopard, and skip ahead to the GCD parts - simply put, it is quite awesome.
GCD seems to be little more than an implementation of Intel's Threading Building Blocks adapted to Apple's platform:
http://www.threadingbuildingblocks.org/
So, in a sense, you as Linux users already have it.
This technology is very valuable to massive single-threaded applications such as legacy or poorly-implemented UNIX applications. This shouldn't be necessary on the Microsoft platform, which has been fully multi-threaded as a standard practice since the 90's.
If Apple has made the technology more developer accessible, then this will be a valuable contribution to the Linux platform.
In short, Apple is giving it away because it has no proprietary value. They probably just want free labor for maintaining their fork of TBB.
Yes, obviously the OS X kernel already manages the threads. The thing that GCD manages is creating the threads based on current system work loads and the tasks that you give it. If you give it 1000 tasks on a 4 core machine and the FLASH plugin is hogging 100% of one core(as it does a lot), GCD may start 3 threads and when you quit the web browser, it will likely start another thread.
In general with most multithreaded programs, you probably start 1 thread per core because you can reasonably assume that this will perform pretty well if there's not much else going on. GCD is running at the OS level so it knows whether there is load on one core and if the other cores are free. Thus it will not start too many threads which will cause more context switching and degrade performance.
What I'm more interested in is the overall productivity. I'm tired of seeing operating systems treat threads like multiple cores. An i7 is not an 8 core cpu! There is a difference between threads and cores.
When you look at how threads can share cache with each other it becomes obvious that threads can potentially become more productive than cores especially with tasks that need such power.
So in the end the question in my mind with GCD is, "Does it identify how much memory each queue needs and whether the queues share memory? Does GCD use this to organize what goes to which thread, allowing a speedup especially when needed? Or does it treat threads the same as cores like everything before it?"
My current computer only has a Core 2 Duo in it so I cannot properly run tests, but the second I get an i7 I'm going to run multiple tests with GCD, OpenCL, OpenMP, ... and see if anything can properly take advantage of threads managing the cache properly. Maybe this is an over-the-top OpenCL type task, but regardless I wouldn't mind looking at GCD in detail to understand how it figures out the best way to manage the queues.
imho memory management can be the best speedup, not how many numbers can be processed in a given time.
I'm honestly surprised how ignorant and lazy the regular slashdotter has become with the years.
Any self-respecting geek should already be keeping up to date with Apple advancements which are and will be impacting technology in the years to come.
If you people haven't noticed already, Apple has been consistently releasing libraries and server software as open source projects for the rest to pick up, use and modify, with liberal licenses.
A friend of mine used to say (can't remember exactly... paraphrasing:)
* Microsoft wants all software to be theirs
* GNU wants all software to be free
* BSD wants all software to be better
And releasing GCD, gentlemen, is another master stroke by Apple, just like WebKit, Bonjour, LLVM, the list goes on, to share knowledge and advance technology by merit, not by forcing it down your throat thanks to the monopoly you have been handed.
The term "block" is familiar to Ruby programmers. It's an old concept which Ruby has made easy to use and hence popular and actually useful.
And here's another lesson which OpenBSD, Apple and Ruby have been putting to work without you guys noticing: any technology that is difficult to use, no matter how good it is, will not be used if it gets in your way; the technology must be easy to deploy/use and unobtrusive to be actually used and useful.
Just remember SELinux and how many people just disable it, no matter how good it is (which I don't think it is, but that's for another rant). Then compare it with the technology that OpenBSD has been implementing for memory protection, which is unobtrusive and ready to use with no extra configuration. Same with Ruby blocks, which more programmers are using and a lot of software is benefiting from now, even though higher order functions and closures have been around for ages.
Having Ruby-like blocks in C and Objective-C is so COOL, you must appreciate that if you think you're serious at programming. Apple has already submitted it to be a standard. I believe MacRuby will benefit from this too, which is Ruby written in Objective-C, which implements Ruby classes as Objective-C classes, achieving incredible speed, taking advantage of Objective-C and LLVM technologies.
Now, I want my late '90s Slashdot back please, where you could more easily find insightful and informative comments. There's a lot of garbage and Microsoft apologists nowadays.
The best way to predict the future is to invent it
And even use an async accumulator across them
http://developer.apple.com/mac/articles/cocoa/introblocksgcd.html
I'm going to play with this later today. Even if the current version's performance isn't awesome, it lays the foundation for some really nice boosts by allowing you to express what you want to the compiler better.
Systems like GCD from apple have been around for quite some time.
IBM Visual Age Smalltalk has had something like GCD for at least a decade. Smalltalk has had blocks since 1972.
"When I first heard about Grand Central Dispatch, I was extremely skeptical. The greatest minds in computer science have been working for decades on the problem of how best to extract parallelism from computing workloads. Now here was Apple apparently promising to solve this problem. Ridiculous."
Yes, ridiculous, since Apple didn't invent it, nor did they point out to you all the pitfalls of using it. Remember they are a marketing organization and all their publicly available materials paint a happy, wonderful veneer on it.
The truth is that concurrency control is difficult and concurrency control via something like GCD is just as difficult and avoids none of the real serious problems.
I spent a year and a half debugging an in-house application for a very large Fortune 500 company that made use of this sort of capability for dispatching parallel work. Much of that time was spent tracking down problems in block-oriented concurrency in a very large code base. Programmers assumed that what they were doing was safe, but in the end those assumptions often turned out wrong. Data being changed by multiple threads is a serious problem. Locks are needed to avoid data corruption in many cases. That's just one example. Very nasty situations developed that required some very smart people to figure out and find solutions for, and we had a team of the very best. Eventually we identified the issues and found solutions. It took eight experienced developers over a year focused on fixing bugs that were derived from naive application of block-oriented parallelism.
Look, if all you have are very simple cases you're home free with GCD. That is nothing new of course as it is with any concurrency. The challenge comes in when your code base grows, changes and your simple cases are no longer simple. The assumptions change.
So learn to use GCD wisely; it's not a magic silver bullet like it's being made out to be by Apple's marketing department. Oh, and remember that C doesn't come with as powerful an IDE as Smalltalk does, so you'll have your work cut out for you debugging any serious concurrency issues in your applications that use GCD, especially if they are huge applications.
I do welcome blocks to C, finally some expressive power that Smalltalk has had since 1972!
Click the link... one of the first things that will hit you in the head is the line saying "Apache License, version 2.0".
"Civis Europaeus sum!"
Blocks themselves are Objective-C objects* (which is to say the first element in the structure is a pointer to their class). That's mostly a convenience, much like how CoreFoundation types are toll-free bridged to NSObjects.
* Most Objective-C objects (other than string constants) are created on the heap; blocks are initially created on the stack and only moved to the heap if necessary.
Do you even lift?
These aren't the 'roids you're looking for.
Who identifies the work units which can run in parallel? A compiler or the application programmer? Who identifies the decomposition of the application data model into the work units and their state sharing and synchronization? A compiler or the application programmer? Who identifies the strategy for gathering work unit results back into the sequentially consistent form required after they all complete? A compiler or the application programmer?
It seems the programmer identifies the work units (called "object blocks") and has explicit control over synchronization, but gives up control over scheduling those tasks. A decent doc is at http://developer.apple.com/mac/articles/cocoa/introblocksgcd.html
The new thing here seems to be the system-wide scheduler for these work blocks. Rather than just having the OS manage threads, this allows the OS to schedule short-lived tasks on whatever resources it has. I don't know if there has ever been system-wide management of lightweight threads before... or if it is a good idea.
ClamAV is included in OS X Server. As is tomcat, and many other bits you can find on the web page, that you can't find in your OS X desktop.
Apple also hosts and continues to support CUPS, the Open Source printing system, a project they bought from the developer that owned it. http://www.cups.org/
What does the GNU D Compiler have to do with any of this?
The queue implementation also looks like it imposes a lot of overhead, so it is not very useful for parallelizing short-running "blocks" of code.
Actually, it's extremely good at parallelizing short-running blocks of code.
Unless your program is going to run on a very specific and unchanging computer configuration with all other tasks known and well-defined, GCD will pretty much *always* do a better job at handling and prioritizing threads.
Any performance lost in overhead is regained many times over in threads not bogging down the system. It's a bit similar to comparing cooperative and preemptive multitasking. With preemptive, there is system overhead in it having to manage all the various processes, but cooperative is so inefficient that the overhead is minuscule compared to the gains in the vast majority of scenarios.
How is this any worse than writing code that uses GCC extensions? I bet you've already done that without realising it. Well, either that or you've never written a line of compiled code in your life.
From what I've seen, Apple definitely got a good deal out of the WebKit community interaction - there's a huge number of people doing bug reporting, snapshot testing, and submitting patches as a result.
This is in OS X only, so no iPhone GCD (at least, not yet...not really necessary until we have multi-core iPhones)
I disagree. As a former Scheme programmer I find it a bit maddening to be so close to have closures as a tool again, yet so far... after all, blocks/closures are a nice way to tell GCD just what you want scheduled, but there are plenty of other great uses for closures in day to day programming where you're not even thinking in parallel.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
Why do you think cooperative threads inefficient? Is it because people use them inefficiently?
Cooperative threads, when used correctly, are more efficient than pre-emptive because with pre-emptive, you have to save and restore all state to context switch (kernel preempts a process, and doesn't know which registers are in use, so has to save them all). With cooperative, you can context switch at function boundaries and not have to save the callee-save registers.
If moderation could change anything, it would be illegal.
And it's not like these "blocks" are actually compiled and linked at run time, it's just a pointer to a static function with a bunch of extra data on it.
Not true, blocks can use any variable from the calling context and it will be incorporated at runtime. These are not just function pointers to static methods.
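For what it's worth, both of the posts above can be right at the same time. Here is a rough sketch in plain C (no blocks extension) of what a closure boils down to: the code is an ordinary static function, and the captured variables are copied into a small structure at run time. All the names here (add_block, make_adder, call_block) are made up for illustration, and the layout clang really uses for blocks is different and carries more fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hand-rolled "block": a function pointer plus the
   variables captured from the scope that created it. */
struct add_block {
    int (*invoke)(const struct add_block *self, int x);
    int captured_offset;   /* copy of a caller's local, taken at run time */
};

/* The code itself is a plain static function, compiled once. */
static int add_block_invoke(const struct add_block *self, int x)
{
    return x + self->captured_offset;
}

/* Creating the block copies the current value of `offset` into it. */
struct add_block make_adder(int offset)
{
    struct add_block b = { add_block_invoke, offset };
    return b;
}

/* A scheduler only needs the structure: it calls through the pointer
   and hands the structure back so the code can reach its captures. */
int call_block(const struct add_block *b, int x)
{
    assert(b != NULL);
    return b->invoke(b, x);
}
```

So make_adder(10) and make_adder(-3) share one function but behave differently when called, which is the part a bare function pointer can't give you.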
"There is more worth loving than we have strength to love." - Brian Jay Stanley
Wow, this is tremendously ugly. Futures seem a better way of doing this:
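#define COUNT 128
int i;
double sum = 0;
future double x[COUNT];  /* "future" is a hypothetical language keyword */
for (i = 0; i < COUNT; ++i) x[i] = complex_calculation(i);  /* spawns the tasks */
for (i = 0; i < COUNT; ++i) sum += x[i];  /* waits for each value on use */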
The async accumulator thing looks confusing to me. With futures the accumulation becomes part of the spawned task.
They also maintain intense secrecy on all their products. Do you dispute anything specific that I said?
Yes - they only have intense secrecy around END PRODUCTS.
But "product" can be thought of another way, as the various parts that make up the whole. While an end user "product" may be secretive, many of the components used to make it are as open as anything can be - BSD subsystems for the OS, Darwin for the kernel, WebKit for Safari (and used across many "secretive" apps like iTunes), GCC and now LLVM, launchd, and now GCD and also OpenCL (which I believe they are submitting to a standards body too).
At the end of the day *nothing* is really unexpected on some level since there are unseen laws and logic governing basically everything that happens.
It's not that much of a mystery. Apple either takes open source things and builds custom apps atop them, or builds custom apps atop technologies that it then open sources. It's not like we're talking about lamb entrails here folks, Apple has been openly and consistently doing this for many years now.
"There is more worth loving than we have strength to love." - Brian Jay Stanley
That's exactly my point. With cooperative multitasking, if you know everything that's going on, it can be more efficient than preemptive multitasking. Just like manually managing your threads can be more efficient than something like GCD.
At the most fundamental level, cooperative is more efficient (if well programmed), just as manually managing threads (if well programmed) is more efficient.
It's that "if well programmed" that's the killer. If your environment is completely known, *and* you are skilled, you have the potential to do a better job by hand. But in the real world, where your software is going to run on all sorts of computers with all sorts of different software and processing capabilities, you can't fine tune your program, so letting the system handle it works better.
Looking at it differently, with cooperative multitasking or manually controlling threads, *if* you have complete knowledge of the system *and* you are sufficiently skilled, you can approach 100% efficiency. With preemptive multitasking, or thread management similar to GCD, you might reach only 90% efficiency. But if you have to build cooperative multitasking or manually managed threads into a program that is going to run on all sorts of computers, you may only manage perhaps an overall average of 70% efficiency.
And given that you can get that 90% efficiency with much less effort than you have to put in to get maybe around 70% (or whatever) efficiency, the benefits here are obvious.
What if you're running two applications that both are capable of monopolizing all your cpu time?
How often does this actually occur? And if it does and the processes are threaded, won't a smart OS scheduler just migrate the threads so that each application essentially has a CPU (or four) to itself?
I think the big win is more likely that this makes threading on Mac easier. That said, I really hate the way they did this; putting high-level language features like lexical closures in C is just ugly. C is a glorified assembler, and should stay that way :-)
If moderation could change anything, it would be illegal.
If other compiler vendors don't pick it up (and they won't with a standards-based alternative)
What standards-based alternative do you have in mind? C++0x? You do know that's a C++ standard, right? Blocks are anonymous functions for plain C.
The "cue the foo posts in 3, 2, 1..." posts will commence with no subsequent foo posts in 3, 2, 1...
Obviously you have no respect for prior research done by respectable scholars, and you think everything under the sun (made by Apple) is new.
I once had a signature.
Apparently in the example, it is neater to show short code as opposed to long code, but it doesn't mean it is designed to run short code.
I once had a signature.
As someone with experience responding to retards, in particular those with inflated self-evaluations of competency.... the above comment is garbage. It's absurd that the above post could be useful to anyone evaluating libdispatch. In particular, it ignores the foregone conclusion that porting this to Linux or Windows would entail the standard macro preprocessor magic to replace the Mach calls with equivalents for other target platforms. Anybody actually evaluating libdispatch has seen ported C. Many times. While it's good to have a library that has already been ported, looking at the above comment, I think you'd have to be an idiot to think that the Mach primitives could not be conditionally compiled with #ifdef. Not to mention C++0x is a C++ standard, and this is a C library.
The "cue the foo posts in 3, 2, 1..." posts will commence with no subsequent foo posts in 3, 2, 1...
I think really we're debating over semantics at this point - my original post was trying to point out that whilst Apple is known for its secrecy and control it is pretty open on infrastructure stuff, where it makes sense for them.
All I meant by "surprising" was that it's unusual to see a company that's known for tight integration and control but which also has another, greatly contrasting side. I'm not personally surprised by Apple's behaviour since it's consistent with what I've seen from them in the past. But I can see why it would be surprising to someone who had only seen the more visible retail face of Apple.
Basically, libdispatch just creates a thread pool for each separate task, then uses some clever magic involving an inter-process semaphore to keep them blocking so that no more than enough threads (ie: the number of CPUs) are running at any given time. Nifty, because it means little change needed to be made to xnu. libdispatch is also, theoretically, portable to other platforms, as long as one can provide a blocks runtime and a compiler capable of handling blocks. I noticed a patch on LLVM's mailing list today providing a Linux port of the blocks runtime, and llvm-gcc and clang both are capable of handling blocks and running on Linux.
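To make the "keep them blocking on a semaphore" idea concrete, here is a small sketch with pthreads and an unnamed POSIX semaphore. It is not libdispatch's code and it spawns one thread per task instead of keeping a pool; it only shows the gating part, assuming the description above is roughly right. The names (throttle, throttled_task, peak_concurrency) are invented, the limit stands in for the number of CPUs, and it assumes a platform where sem_init works, such as Linux.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

#define TASKS 16

/* Hypothetical throttle: the semaphore counts free "CPU" slots. */
struct throttle {
    sem_t slots;
    pthread_mutex_t lock;   /* protects the two counters below */
    int running;            /* tasks currently past the gate */
    int peak;               /* highest value `running` ever reached */
};

static void *throttled_task(void *arg)
{
    struct throttle *t = arg;

    sem_wait(&t->slots);                 /* block until a slot is free */
    pthread_mutex_lock(&t->lock);
    t->running++;
    if (t->running > t->peak)
        t->peak = t->running;
    pthread_mutex_unlock(&t->lock);

    /* ... the real work would go here ... */

    pthread_mutex_lock(&t->lock);
    t->running--;
    pthread_mutex_unlock(&t->lock);
    sem_post(&t->slots);                 /* hand the slot to a waiter */
    return NULL;
}

/* Runs TASKS threads through the gate and reports how many were ever
   inside it at once; the result cannot exceed `limit`. */
int peak_concurrency(unsigned int limit)
{
    struct throttle t;
    pthread_t threads[TASKS];
    int i, rc;

    assert(limit > 0);
    rc = sem_init(&t.slots, 0, limit);
    assert(rc == 0);
    pthread_mutex_init(&t.lock, NULL);
    t.running = 0;
    t.peak = 0;

    for (i = 0; i < TASKS; ++i) {
        rc = pthread_create(&threads[i], NULL, throttled_task, &t);
        assert(rc == 0);
    }
    for (i = 0; i < TASKS; ++i)
        pthread_join(threads[i], NULL);

    pthread_mutex_destroy(&t.lock);
    sem_destroy(&t.slots);
    (void)rc;
    return t.peak;
}
```

All sixteen threads exist the whole time, but at most `limit` of them are ever doing work; the rest sit in sem_wait.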
haha! weiner you mean Osama
Semi-automatic amateur armchair Australian philosopher; conjecture ready at any moment...
Why the heck did they release it under an Apache 2 license, and not under a regular BSD license?
{{.sig}}
Is the Cilk runtime available for OS X? If not then why consider it on OS X?
Companies are indeed defined by what they do - the majority of what Apple does in the marketplace is provide tightly integrated "Just Works" experiences by maintaining a strong control over what they do.
I disagree, slightly. Apple does, indeed, use the business model of providing a polished end user experience, often limited, but easy to use for what it does. They sometimes do this by maintaining a lot of control over all the components, but not in all cases. In other cases they do this by partnering with other companies and organizations or by building a product that integrates well with the existing ecosystem. Controlling all the components is only one of their methods of implementing that business plan.
They also maintain intense secrecy on all their products. Do you dispute anything specific that I said?
Absolutely. Apple completely documents and voluntarily provides the source code for large portions of many of their products. That is certainly not maintaining secrecy about them. Rather, Apple is famed for being secretive about new and upcoming products that have not been released. This is for two reasons. First it allows them to make an effective marketing splash and draws interest from the press. Second, Apple makes a lot of money by innovating and by maintaining secrecy they maximize the time to market advantage they have over those who copy their innovations.
That said, such secrecy as Apple has demonstrated has little to do with whether or not Apple releases source code to products after they come to market. That's an action they regularly take, so to not expect them to take it, you have to be misconstruing what Apple normally does.
At the end of the day *nothing* is really unexpected on some level since there are unseen laws and logic governing basically everything that happens. But if you see a trend in behaviour from somewhere (e.g. tight control and secrecy in most of Apple's most visible things) and then that trend is apparently violated, that qualifies as "surprising" in my opinion.
Ahh, but if you're perceiving Apple as being secretive about how their products work and about the source code it just means your perception is inaccurate. This, in turn, means you should look at your preconceptions and information sources. Apple releases the source code for the 600th or 700th project and that's a surprise? Apple releases the source to a foundational new technology they implemented in OS X? That might have been surprising when it was the original release of Darwin, or even when it was the libc and io stuff they made from scratch and gave away the source to subsequently. But come on already. They've been releasing the source to underlying technologies they add to OS X, with every release for over a decade now. Grand Central is exactly the type of technology that Apple leverages the open source model for, being as it benefits from more eyes and more contributors and the more people that use it the better software developers will get at making applications multiprocess well on OS X. It should be no surprise at all to someone with an accurate perception of how Apple operates as a fairly good, mixed OSS community player, that understands OSS and regularly leverages it for their products.
Frankly, I think your perception of Apple has to be pretty messed up to be surprised by Apple releasing the source to GCD. The reason this is news is because of the idea that other OS's might be able to integrate it and because it is potentially so useful. Most of the time what Apple releases isn't even news. Heck, they released a dozen packages for their underlying security framework in case anyone wants to make use of their version of sandboxing.
I guess the long and short of the matter is, if you're surprised by this, do you just have an anti-Apple bias and why do you have that?
Erlang is very cool, but it is not designed to replace C. In fact, it is designed to handle some bits of the higher level concurrency stuff and call out to C "drivers" for low-level work. Apparently Ericsson's switch code has almost as much C/C++ code as Erlang code. GCD addresses concurrency in problem spaces Erlang is completely inappropriate for, just like Erlang plays in spaces that GCD is not appropriate for. They are different tools for different jobs.
You're misinterpreting what I originally meant in a fairly extreme way. Did you look at the context I was commenting in?
To be fair, perhaps I could have phrased it better. When I said "products" I should probably have said "retail products" to disambiguate from stuff (like open source code) that they produce but don't market to consumers.
I used "surprising" in the sense that "It's unusual to see a company that has both extreme secrecy and enlightened openness". "Surprising" in general, then, as opposed to surprising for a fully informed observer. I wasn't surprised personally, since I've been following Apple's open source activity for years, including quite obscure stuff (for instance, I understand they sponsored some Linux work back in the day). However, the person I replied to *was* personally surprised - I was trying to explain that whilst I understood his surprise, it was not indicative of anything fishy.
If you reread my post, you'll see that I commented that Apple do open source quite a lot of infrastructure stuff and that I believe that maintaining those seemingly contradictory stances is an indication of strong management. I was replying to somebody who was suspicious of the motives of Apple in this - and pointing out that although it might look odd it's actually quite normal behaviour for them.
I was aiming to be evenhanded and was, after all, defending them against accusations of copyright infringement. If a defense of the company qualifies as anti-Apple, then I'd hate to meet an Apple hater!
Don't be a sore loser, nobody said it was a new concept. I did call you a loser though, chew on that.
Right, that's clearer now.
But consider this: you don't have to know everything that's going on in order for cooperative threads to work, as long as none of the threads block (non-blocking i/o) and the compiler enforces a yield() every so often. This can all be part of the libraries and compiler you use.
It wouldn't help spread the load across multiple CPUs, but I think cooperative threading is unfairly maligned :-)
If moderation could change anything, it would be illegal.
The async accumulator is adding the values into the sum as they are computed rather than waiting for them to complete in order like they do in my first example and your future example. This means you don't have to have an array and only have to have one block on the main thread for the sum, rather than 128 blocks for each value in the array. You need an upvalue to do this, and it seems most future implementations don't have them because it makes the syntax hard to read.
Since we are just adding the complex calculations together, the first GCD and future examples are probably pretty much tied on speed. GCD only has one block on the main thread, where the future example gets to add numbers as they come in. The future example would take a penalty for continually waking the main thread just to add a number.
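Here's roughly what "adding the values into the sum as they are computed" looks like if you spell it out with plain pthreads. This is only a sketch: in GCD the accumulation would be a block sent to a queue, and here a mutex stands in for that serialization. complex_calculation is a made-up stand-in and the other names are invented too.

```c
#include <assert.h>
#include <pthread.h>

#define COUNT 128

/* Hypothetical stand-in for the expensive per-item work. */
static double complex_calculation(int i)
{
    return (double)i * 0.5;
}

/* The shared "upvalue": one sum, guarded by one lock. */
struct accumulator {
    pthread_mutex_t lock;
    double sum;
};

struct task {
    struct accumulator *acc;
    int index;
};

static void *run_task(void *arg)
{
    struct task *t = arg;
    double value = complex_calculation(t->index);  /* runs in parallel */

    pthread_mutex_lock(&t->acc->lock);             /* serialized part */
    t->acc->sum += value;                          /* added on completion */
    pthread_mutex_unlock(&t->acc->lock);
    return NULL;
}

/* No array of results: each task folds its value into the sum itself,
   in whatever order the tasks happen to finish. */
double accumulate_in_completion_order(void)
{
    struct accumulator acc;
    pthread_t threads[COUNT];
    struct task tasks[COUNT];
    int i, rc;

    pthread_mutex_init(&acc.lock, NULL);
    acc.sum = 0.0;

    for (i = 0; i < COUNT; ++i) {
        tasks[i].acc = &acc;
        tasks[i].index = i;
        rc = pthread_create(&threads[i], NULL, run_task, &tasks[i]);
        assert(rc == 0);
    }
    for (i = 0; i < COUNT; ++i)
        pthread_join(threads[i], NULL);

    pthread_mutex_destroy(&acc.lock);
    (void)rc;
    return acc.sum;
}
```

The trade-off is the one described above: no array of results and no waiting in index order, at the price of every task touching the shared sum.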
Do futures explicitly spawn tasks, or is it just lazy evaluation?
Don't blame me, I voted for Baltar.
All you need with futures is an atomic operation.
Granted, this is a simple case but to do anything more complex would likely require explicit synchronization for any mechanism.
Ultimately we're talking about syntax and readability here and compiler-generated complexity always trumps user-written complexity.
Yes, they spawn tasks. The idea of futures is to add some syntactic sugar and let the compiler generate the code to spawn the tasks and do the waiting on the results.
I would add "powerful" to your points (which I guess are limited to the desktop):
* Mac OS X: usable and powerful (great UI + great foundation)
* Windows: just plain convenient, thanks to the size of the install base and people familiar with it
* GNU/Linux: powerful, but not usable
That said, I'm actually using the three of them at work:
* GNU/Linux for the people who are responsible for a few and very specific tasks for which Ubuntu has been customized. :)
* Windows for the yet-to-be-converted PCs, because of in-house systems or 3rd party software that require Windows and are awaiting an alternative
* Mac OS X for people that know better. Which means the IT department
The best way to predict the future is to invent it
These are hard to read about, I'm having some trouble finding links. Could you post one or two here?
Don't blame me, I voted for Baltar.
The native compiler has all the needed data, and can be as skilled as you make it.
I know tobacco is bad for you, so I smoke weed with crack.
Wikipedia has an ok article here. It has a link to the C++0x standard library implementation of futures which is somewhat limited. It has many of the same problems as the blocks concept - namely, it requires too much programmer interaction. To be done properly, futures should really be understood by the compiler so it can generate all the boilerplate.
I'm most familiar with the Cray XMT implementation of futures as described here. To briefly summarize, a future is a variable linked to the output of some asynchronous operation, usually a function call. When the future's value is set, a thread is spawned to process some task that produces the right-hand side of the future's assignment statement. Later on, when the future is actually used, there is a compiler-generated synchronization point. If the task is complete the parent thread just continues on as normal. If the task is not ready, the parent thread waits at the point of the future's use until the value becomes available. The future's thread can come from anywhere: a thread pool, a new spawn, etc.
So essentially futures are a natural way to express the concept of some side task producing a value needed "sometime later." The XMT has special hardware to assist in the efficient processing of threads and futures but one can implement futures on any machine that provides a threading model. It's a very nice abstraction that allows the programmer to get out from under the gory details and concentrate on the problem being solved.
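For anyone who wants to see the mechanics, this is approximately the boilerplate a compiler with built-in futures would be generating, written out by hand with pthreads. It is a sketch of the idea described above, not the XMT or C++0x implementation; future_double, future_start and future_get are invented names, and it spawns a fresh thread per future where a real runtime might use a pool.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical hand-written future holding one double. */
struct future_double {
    pthread_t thread;
    double (*fn)(int);   /* task that produces the value */
    int arg;
    double value;
    int joined;          /* set once the value has been collected */
};

static void *future_run(void *p)
{
    struct future_double *f = p;
    f->value = f->fn(f->arg);
    return NULL;
}

/* "Assigning" to the future: spawn the task for the right-hand side. */
void future_start(struct future_double *f, double (*fn)(int), int arg)
{
    int rc;

    f->fn = fn;
    f->arg = arg;
    f->joined = 0;
    rc = pthread_create(&f->thread, NULL, future_run, f);
    assert(rc == 0);
    (void)rc;
}

/* "Using" the future: the synchronization point. Returns at once if the
   task has finished, otherwise waits here until the value is ready. */
double future_get(struct future_double *f)
{
    if (!f->joined) {
        pthread_join(f->thread, NULL);
        f->joined = 1;
    }
    return f->value;
}

static double square(int i)
{
    return (double)i * i;
}

/* Same shape as the two-loop futures example earlier in the thread. */
double sum_of_squares(int count)
{
    struct future_double x[16];
    double sum = 0;
    int i;

    assert(count >= 0 && count <= 16);
    for (i = 0; i < count; ++i)
        future_start(&x[i], square, i);
    for (i = 0; i < count; ++i)
        sum += future_get(&x[i]);
    return sum;
}
```

With language support, future_start and future_get disappear into the assignment and the read, which is the whole appeal.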
Garbage collection is robust enough: the block gets copied to the (garbage collected) heap when it's passed into dispatch_async() or similar calls automatically, and it uses scanned memory to do so, meaning the collector tracks a refcount on 'self'.
In non-collected situations, it's up to the developer to ensure that this doesn't happen somehow, for instance by retaining 'self' before the first dispatch_async() and releasing it inside the last block.
Not to mention that extra 10+ GB of free space :)
SWM seeks new sig for a brief fling