How Snow Leopard Cut ObjC Launch Time In Half

Posted by Soulskill on Sunday September 6, 2009 @02:33AM from the paring-down dept.

MBCook writes "Greg Parker has an excellent technical article on his blog about the changes to the dynamic linker (dyld) for Objective-C that Snow Leopard uses to cut launch time in half and cut about 1/2 MB of memory per application. 'In theory, a shared library could be different every time your program is run. In practice, you get the same version of the shared libraries almost every time you run, and so does every other process on the system. The system takes advantage of this by building the dyld shared cache. The shared cache contains a copy of many system libraries, with most of dyld's linking and loading work done in advance. Every process can then share that shared cache, saving memory and launch time.' He also has a post on the new thread-local garbage collection that Snow Leopard uses for Objective-C."

I thought this was the shared libs always worked by noidentity · 2009-09-06 02:43 · Score: 3, Interesting

I thought one of the points of shared libraries was that the files could simply be mapped read-only into the memory of each process and executed directly. This would only require one physical copy in memory, even though it may exist at multiple logical addresses.
I take it that for a while now shared library machine code has had to be patched in memory after it's loaded, which prevents easy sharing among processes and means each patched page needs its own space in the swap file.
Sounds like this latest improvement brings things back to the way they were, by effectively writing the patched version back to disk so that it can be mapped read-only as before and doesn't have to be patched every time the library is loaded into a process. It's odd, because I thought the OS already did this several versions ago with prebinding.
Re:enough fucking by Anonymous Coward · 2009-09-06 02:44 · Score: 5, Funny

You've had enough of fucking, and would like more Snow Leopard stories? Each to his own, I guess.
Re:I've heard that before.... by BorgDrone · 2009-09-06 02:44 · Score: 5, Informative

Did you even read the article? Suppose not... this is Slashdot after all.
The article states that prebinding (similar to prelink) was used in previous versions of OS X and has been replaced by a much faster shared cache.
Re:I've heard that before.... by mdwh2 · 2009-09-06 02:47 · Score: 3, Funny

Well okay then, Apple were the ones who "popularised" it! ("Well I hadn't heard about Superfetch, but I heard about Apple doing it first, therefore, Apple did it first")
Or um ... they "integrated" it better. Yeah, that's it.
Re:enough fucking by lysergic.acid · 2009-09-06 02:50 · Score: 4, Funny
But I haven't even had a chance to submit these yet:
dyld by DoofusOfDeath · 2009-09-06 03:04 · Score: 5, Funny

dyld - noun. A reminder that regardless of age, you'll always have an adolescent sense of humor.
Re:I've heard that before.... by CharlyFoxtrot · 2009-09-06 03:08 · Score: 3, Interesting

It's nothing like Superfetch. Superfetch preloads applications into system memory. This shared cache doesn't do that; instead, from what I understand, it performs in advance some of the work the linker would otherwise do at load time.

--
If all else fails, immortality can always be assured by spectacular error.
Re:I've heard that before.... by lysergic.acid · 2009-09-06 03:22 · Score: 3, Informative

Well, the FP is claiming that they're the same thing, and your post seemed to be agreeing with him. But yeah, that troll rating probably should have gone to the post you replied to.
I don't know anything about prelink, but Superfetch sounds completely different from dyld. Superfetch keeps frequently launched applications in memory to make them launch faster (much like Winamp Agent does for Winamp). dyld, OTOH, shortens application launch times by not reloading a shared library each time an application is launched. Keeping the shared library loaded in a shared cache also reduces the number of copies of that library you need loaded in memory. It doesn't sound like Superfetch does that.
Both a turbocharger and a cold air intake can improve car performance, but that doesn't make them the same thing.
Re:Doesn't sound like this is loading apps. by Ma8thew · 2009-09-06 03:24 · Score: 4, Informative

Objective-C is not equivalent to Cocoa. Cocoa is a set of frameworks written in Obj-C and primarily used by Obj-C programs.
Re:I've heard that before.... by Simonics+Zsolt · 2009-09-06 03:32 · Score: 3, Insightful

And maybe kdeinit has done something similar since 2003?
Re:Commen Sense Sharded Library by Anonymous Coward · 2009-09-06 03:47 · Score: 3, Interesting

Methinks you (and many other readers) are mistaking this feature for more traditional static dyld caching.
This enhancement is actually about caching a runtime computation for Objective-C purposes. In practice, as the linked article indicates, this computation is consistent most of the time. In some cases it is not. So to handle the general and most common case, these computations (selector uniquing) are cached and used across different processes.
So the fair question is: does Linux cache selector uniquing?
Re:I've heard that before.... by plsuh · 2009-09-06 03:51 · Score: 5, Informative
Moderators, please mod the parent down -- it completely misses the point.
Objective-C selector uniquing caching is NOT the same as Windows Superfetch.
Objective-C uses a two-phase dispatch for method calls. When you see a call in the Objective-C source code that looks like:

[myObject init];
the dispatch system:
1. Looks up the function pointer for the method "init" in a table.
2. Calls the "init" function via the function pointer.
The problem arises in the method dispatch table when you have multiple methods named "init" -- which is very common. When an application is loaded the dynamic loader ("dyld") needs to separately identify all of the methods named "init" (and any other methods with conflicting names) that apply to different classes. This is done by "tagging" each method in the dispatch table, a process called selector uniquing.
Now, this has to be done not only for the application binary itself, but also for any Objective-C classes in shared libraries that are loaded. Almost all apps on Mac OS X load the libobjc.dylib library, which is cached to improve performance. As a part of the caching process, Snow Leopard now does the selector uniquing only once, and then stores the uniqued selectors in the cache. Thus, any application that links against libobjc.dylib (or any other library that is in the cache) only has to unique its own selectors, not those of the library as well. This significantly reduces the amount of overhead for launching an application compared to previous versions of Mac OS X.
This process does not attempt to retain application binary code in memory in the face of page-outs as Superfetch does. Selector uniquing caching speeds application launch times by reducing the amount of computation that has to happen at launch, not by pre-loading the application's binary.
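The two-phase dispatch described above can be sketched in plain C. This is only a rough illustration, not the real Objective-C runtime: the names `Class`, `MethodEntry`, `lookup_method`, `send_message`, `Window` and `Button` are all made up for the example, and the real runtime (`objc_msgSend`) is considerably more elaborate. The sketch shows two classes that each have their own "init", with the lookup done by string comparison, which is the kind of work that uniqued selectors are meant to avoid.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration; not the real Objective-C runtime. */
typedef const char *(*Method)(void);

typedef struct {
    const char *selector;   /* method name, e.g. "init" */
    Method      imp;        /* function implementing it for this class */
} MethodEntry;

typedef struct {
    const char        *name;
    const MethodEntry *methods;
    size_t             method_count;
} Class;

/* Two unrelated classes, each with its own method named "init". */
static const char *window_init(void) { return "Window init"; }
static const char *button_init(void) { return "Button init"; }

static const MethodEntry window_methods[] = { { "init", window_init } };
static const MethodEntry button_methods[] = { { "init", button_init } };

static const Class Window = { "Window", window_methods, 1 };
static const Class Button = { "Button", button_methods, 1 };

/* Phase 1: find the function pointer for `selector` in the class's table. */
Method lookup_method(const Class *cls, const char *selector) {
    assert(cls != NULL && selector != NULL);
    for (size_t i = 0; i < cls->method_count; i++) {
        /* String comparison on every probe: slow compared to comparing pointers. */
        if (strcmp(cls->methods[i].selector, selector) == 0)
            return cls->methods[i].imp;
    }
    return NULL;
}

/* Phase 2: call through the function pointer; returns NULL if no method matched. */
const char *send_message(const Class *cls, const char *selector) {
    Method imp = lookup_method(cls, selector);
    return imp ? imp() : NULL;
}
```

With these definitions, `send_message(&Window, "init")` and `send_message(&Button, "init")` reach different functions even though the selector name is the same, and an unknown selector such as "release" yields NULL.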
Thread-local garbage collection is NOT the same as Windows Superfetch.
Thread-local garbage collection is a third phase of garbage collection added on top of the Objective-C 2.0 garbage collection system, which speeds up the garbage collection system even further. By concentrating GC to what has occurred in a single thread, the GC system can delay and reduce the cost of a slow global sweep even beyond the generational GC algorithm.
Windows Superfetch is a response to poorly written software.
To quote from the Wikipedia article:

The intent is to improve performance in situations where running an anti-virus scan or back-up utility would result in otherwise recently-used information being paged out to disk, or disposed from in-memory caches, resulting in lengthy delays when a user comes back to their computer after a period of non-use.
In my opinion as an experienced application developer the user should never run into the problem that Superfetch attempts to solve. Anti-malware scans or backups are generally limited by I/O transfer rates, not by CPU. In such situations, using lots of memory to pre-load data makes no sense. It is relatively easy to write a two-buffer, threaded, streaming system for situations that are constrained by disk transfer rates without consuming scads of memory.
In the bigger picture, Superfetch attempts to learn the times of day when apps are used and pre-loads their binaries. This is a nice concept, but I have serious doubts as to how useful it really is. The penalty for guessing wrong is fairly high, and users are more tolerant of consistent small slowdowns than they are of occasional long hangs (see the Mac literature on the spinning beach ball).
Mac OS X is less likely to need such anti-malware scans in the first place as the application binaries are now digitally signed by the developer. Any malware that attempts to insert itself into applications will run into problems. This is not to say that the Mac is immune -- I can think of a number of holes that could be exploited (such as the fact that unsigned binaries w
Re:enough fucking by TheRaven64 · 2009-09-06 04:21 · Score: 4, Insightful

Comparing this to Superfetch is ignorant beyond belief. Superfetch is part of the paging system on Windows and attempts to trigger page faults before the data is actually needed so that it's already cached when it is needed. This is quite a nice feature and one I am naturally prejudiced to like because my PhD was in this topic.
This is entirely different. Part of it is similar to the existing prebinding / prelinking stuff in Leopard / Linux, which generates the relocation tables in position-independent code. This is nothing like anything in Windows, because Windows doesn't use position-independent code for shared libraries (it uses a horribly ugly hack which performs better in the best case and much worse in the worst case). The article is a bit too light on details to understand exactly why the new version performs so much better.
The other half, however, is very clever. By caching the selector uniquing information, they are saving a lot of time when loading compilation units containing Objective-C code. Even better is the fact that, because these symbols are now not modified, they can be shared between processes without triggering copy-on-write faults. This isn't actually that hard to implement for the GNU runtime; just give the selector symbols mangled names and mark them as having common linkage (it's a bit harder on Darwin because Mach-O is weird), then you can use pointer comparison as a first step in the runtime and avoid the strcmp() call. Combine this with the prelinking support and you get the caching for free, which is very nice. I actually implemented this in Clang while writing this post, so expect to see it on non-Apple platforms soon too.
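As a rough illustration of "pointer comparison as a first step", here is a minimal C sketch. The function name `sel_equal` is hypothetical and this is not code from the GNU runtime or Clang; it only shows the idea that two selectors already merged to one address compare equal without ever touching strcmp().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: compare two selector names, cheapest check first. */
bool sel_equal(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    if (a == b)                  /* fast path: same address means same selector */
        return true;
    return strcmp(a, b) == 0;    /* slow path: names were never merged */
}
```

If the linker has already merged every copy of a selector name into one symbol, the fast path is the one that gets taken; the strcmp() fallback only matters for names that were not merged.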

--
I am TheRaven on Soylent News
Re:Doesn't sound like this is loading apps. by INT_QRK · 2009-09-06 04:33 · Score: 3, Interesting

What impresses me significantly is that instead of concentrating on glitzy and often useless new "features," Apple actually implemented substantive performance enhancements. The import of this approach can't be praised enough in my view. Anecdotally, I recovered 6 GB of hard drive space, and have experienced noticeably zippier launches since yesterday's upgrade. My MacBook Air with Snow Leopard loaded on it feels almost as nimble as my old IBM T-41 that runs Ubuntu 9.04. Holy cow, this is no small thing. Just, good on them!
Re:Commen Sense Sharded Library by lurch_mojoff · 2009-09-06 04:44 · Score: 3, Insightful

Does Linux need selector uniquing if it doesn't use Objective-C?
No, it doesn't. Since the average executable on Linux is static code linked to dynamic libraries made up of static code, you get your "selector uniquing" at compile time: you don't get a method selector description, instead you get a pre-calculated and already unique address of the method or function.

To me this sounds like an inefficiency in Objective-C that made it less efficient than C++ (the other OO flavour of C) has been improved somewhat.
It is a tradeoff. You get to worry about the performance of shared library selector uniquing, but you get all the benefits of a dynamic language and runtime. In practice such inefficiencies matter most in cases where you are very constrained for resources, e.g. on a phone, as hinted in TFA. I doubt that, in the context of the rest of the performance and efficiency improvements in Snow Leopard and on a reasonably modern computer, the 1/10 of a second or the few megabytes of memory saved matter all that much.
Re:I've heard that before.... by TheRaven64 · 2009-09-06 05:23 · Score: 4, Informative

selectors, which I believe can't be prebound (for you java programmers, these are equivalent to interfaces - C/C++ does not have this concept and instead allows direct access to the classes using protected or public)
I'm sorry, but this and most of the rest of your description is completely wrong. Selectors are nothing like Java interfaces. Interfaces are Java's version of Objective-C Protocols. Selectors are abstract method names (Smalltalk calls them symbols). Each Objective-C class has some data structure mapping these to function pointers. When you send a message (call a method) you look up the function pointer corresponding to the selector in the receiving class. To make this fast, all selector comparisons are done as pointer comparisons. To make this work, the runtime needs to make sure that selectors are unique. This process involves building a large hash table and inserting every selector referenced by every compilation unit into it. By making the linker handle this uniquing, you have several advantages. The first is that the resulting table can be shared more easily between processes, resulting in a memory efficiency gain. The second is that the runtime can first try doing pointer comparison when registering a new selector, and only use the hash if the linker didn't unique the selector.
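To make the uniquing step concrete, here is a small C sketch of a selector table. It is an assumption-laden toy rather than the actual runtime: `sel_unique`, `hash_name` and the fixed `TABLE_SIZE` are invented for the example, and the table stores the caller's pointer instead of copying the string. It does show the order of operations described above: try pointer comparison first, fall back to comparing names, and register the selector if it is new.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 64   /* assumed small, fixed capacity for the sketch */

static const char *table[TABLE_SIZE];   /* open-addressing hash table */
static size_t table_used;

/* 32-bit FNV-1a hash of a NUL-terminated string. */
static uint32_t hash_name(const char *s) {
    uint32_t h = 2166136261u;
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* Return the canonical pointer for `name`, registering it on first sight.
   The caller must keep `name` alive: the table keeps the pointer, not a copy. */
const char *sel_unique(const char *name) {
    assert(name != NULL);
    size_t i = hash_name(name) % TABLE_SIZE;
    while (table[i] != NULL) {
        /* Pointer check first; compare the names only if the addresses differ. */
        if (table[i] == name || strcmp(table[i], name) == 0)
            return table[i];
        i = (i + 1) % TABLE_SIZE;   /* linear probing */
    }
    /* Always leave one slot empty so the probe loop is sure to terminate. */
    assert(table_used < TABLE_SIZE - 1);
    table[i] = name;
    table_used++;
    return name;
}
```

After two different copies of "init" have been passed through `sel_unique`, both callers hold the same pointer, so later selector comparisons can be plain pointer comparisons.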

--
I am TheRaven on Soylent News
Re:Apple made a rod for their own back with Obj-C by mgbastard · 2009-09-06 05:46 · Score: 3, Insightful

Right, because obviously the ultimate evolution of computer languages for all time is C and C++. There's never any need to further innovate that technology whatsoever.
Are you @#$@ kidding? It wasn't that funny.
I take issue with the assertion that nobody ever caught on with it. GNUStep? NeXT has been around for something like 15 years in industry now. EDS and others used it. Ross Perot was so impressed he invested in it and became a director at NeXT. It has a very feature-rich set of frameworks associated with it, depending on your OS deployment. The only thing that sucks is Apple dropping OPENSTEP / Obj-C for Windows. But Steve didn't care about the enterprise market anymore at the time, and it might have eroded some Mac hardware sales, and you couldn't very well charge a license for it. (I disagree, I think you could and can)

--
Anyone seen my low uid? last seen 10 years ago while panning the #@$# out of Taco's 'web based discussion system'