Former Sun Mobile JIT Engineers Take On Mobile JavaScript/HTML Performance
First time accepted submitter digiti writes "In response to Drew Crawford's article about JavaScript performance, Shai Almog wrote a piece providing a different interpretation for the performance costs (summary: it's the DOM not the script). He then gives several examples of where mobile Java performs really well on memory constrained devices. Where do you stand in the debate?"
An arbitrary tree of arbitrary objects with arbitrary methods of manipulating them thanks to years of incremental changes that are never properly documented (quick! point to the document showing that a select tag has a value attribute!) and are never deprecated.
I think the point of Drew's article wasn't that performance had to suffer; it was that garbage collection isn't free. It has to take place, though, so it's not an O(GC)=0 component. If garbage collection takes place in a lot of memory, but not -enough- memory, it takes a very long time, in real time. Depending heavily on the application, it may be very visible at the UI level.
Programmers intent on using all of the resources available, and performing intensive tasks, should think about means other than garbage collection.
While this post is a valuable addition to Drew's analysis, I feel it's not really a rebuttal at all.
Yes, JavaScript is slow for the reasons Drew mentioned and yes, the DOM is a nightmare to optimize for responsive UIs. They're both right.
While this blog also provides some nice insight into how you can have acceptable performance with a GC on mobile, it's not offering any workable alternative that would work for JavaScript. So Drew's article still comes out pretty strong, IMO.
I dislike the separation of 'Perceived' vs 'Actual' performance. If I perceive it to be slow, it's slow. This reminds me of the Firefox devs who spent years saying that if an add-on makes their browser a memory hog and a slowpoke, it's not their problem because their performance is fine.
Devs.. If it's slow, it's slow. Call it perceived, call it actual, call it the Pope for all I care. It's a Slow Pope.
Crawford brought in lots of data on real-world performance. (e.g. http://sealedabstract.com/wp-content/uploads/2013/05/Screen-Shot-2013-05-14-at-10.15.29-PM.png)
Almog's rebuttal has a lot of claims with no actual evidence. Nothing is measured; everything he says is based on how he thinks things should in theory work. But the "sufficiently smart GC" is as big a joke as the "sufficiently smart compiler", and he even says "while some of these allocation patterns were discussed by teams when I was at Sun I don't know if these were actually implemented".
Also:
I'm a professional game programmer, and I'm laughing at this. If you're making Space Invaders, and there's a fixed board and a fixed number of invaders, that statement is true. If you're making a game for this decade, with meaningful AI, an open world that's continuously streamed in and out of memory, and dynamic, emergent, or player-driven events, that's just silly. For Mr. Almog to even say that shows how much he doesn't know about the subject.
If they care about performance why not use C?
From the first article:
From the second article:
The first article has original research with measurements. The second article has "it's not a big deal". The first article links to research papers. The second article has "it's not a big deal".
With an open mind, I know which one I'm more inclined to believe.
Perhaps it's a difference between throughput and latency. Nine moms can make nine babies in nine months, offering nine times the throughput of one mom, but each baby still takes nine months from conception to completion. Users tend to notice latency more than throughput unless an operation already takes long enough to need a progress bar. Some algorithms have a tradeoff between throughput and latency, which need fine tuning to make things feel fast enough.
There are also a few ways to psychologically hide latency even if you can't eliminate it. The "door opening" transition screen in Resident Evil is one example, hiding a load from disc, as are some of the transitions used by online photo albums to slowly open a viewer window while the photo downloads.
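To put numbers on the nine-moms analogy, here is a minimal sketch in Java. The class and method names are made up for illustration; the only point is that adding parallel workers raises throughput while the latency of any single item stays the same.

```java
/** Hypothetical helper; names are illustrative, not from any library. */
final class ThroughputVsLatency {
    /** Items completed per unit of time when `workers` run in parallel. */
    static double throughput(int workers, double timePerItem) {
        return workers / timePerItem;
    }

    /** Start-to-finish time of one item; adding workers does not change it. */
    static double latency(int workers, double timePerItem) {
        return timePerItem;
    }
}
```

With `timePerItem = 9.0` (months), going from one worker to nine multiplies `throughput` by nine, but `latency` returns 9.0 either way, which is the part the user actually waits for.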
The separation is useful to understand where the optimization is necessary. JavaScript could be made less painful if it didn't have DOM manipulation to contend with, but obviously that's not very practical.
If they care about performance why not use assembly?
Some people do (in a sense) use assembly when they use C. If a particular inner loop is running too slowly, an expert programmer might look at the code that the compiler generates to see what's going wrong. For example, an inefficient translation of C to assembly language might point to needing const or restrict qualifiers to allow the compiler to make certain optimizations.
I guarantee JavaScript will perform much better once we get to 16 cores and 3.6 GHz on the standard mobile device.
So they want to JIT JavaScript? Traversing trees is standard stuff, why should the DOM be so much slower?
I actually agree with almost everything Drew wrote, with the exception of his GC statements. I worked for an EA contractor in the '90s doing large-scale terrain streaming on what today would be a computer less powerful than an iPhone, so while my game programming experience might be outdated, it's still valid. Saying that I don't know if it was actually implemented only referred to the last section. Like I said, I actually worked on the VM code as well as the elements on top of it. As I said in the comments to the article, "never" might have been harsh, but I pretty much stand by it. If you use a GC you need to program with that in mind and design the times where GC occurs (level loading). Most of your dynamic memory would be textures anyway, which are outside the domain of the GC altogether and shouldn't trigger GC activity. To avoid overhead in a very large or infinite world you swap objects in place by using a pool. No, it's not ideal, and you do need to program "around" the GC in that regard. OTOH, when we programmed games in C++ we did exactly the same thing: no one would EVER allocate dynamic memory during gameplay, precisely to avoid the framerate cost.
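For readers who haven't seen the pooling approach described above, here is a minimal sketch in Java. `Projectile` and `ProjectilePool` are hypothetical names invented for this example, not taken from any engine or from the article. The assumption is that the pool gets filled once (say, at level load) and gameplay code only recycles instances afterwards.

```java
import java.util.ArrayDeque;

/** Hypothetical game object; the name and fields are illustrative only. */
final class Projectile {
    float x, y;
    boolean active;

    /** Clears state so a recycled instance looks freshly created. */
    void reset() {
        x = 0f;
        y = 0f;
        active = false;
    }
}

/** Fixed-size pool: every Projectile is allocated up front, e.g. during level loading. */
final class ProjectilePool {
    private final ArrayDeque<Projectile> free;

    ProjectilePool(int capacity) {
        free = new ArrayDeque<>(capacity);
        for (int i = 0; i < capacity; i++) {
            free.push(new Projectile());
        }
    }

    /** Hands out a recycled instance, or null when the pool is exhausted; never allocates. */
    Projectile acquire() {
        Projectile p = free.poll();
        if (p != null) {
            p.active = true;
        }
        return p;
    }

    /** Resets the instance and returns it to the pool for reuse. */
    void release(Projectile p) {
        p.reset();
        free.push(p);
    }

    /** Number of instances currently available. */
    int available() {
        return free.size();
    }
}
```

Gameplay code would call `acquire()` when something spawns and `release()` when it dies, so the steady state creates no new `Projectile` objects for the collector to trace or reclaim. Returning null on exhaustion is one design choice among several; growing the pool or recycling the oldest instance are others.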
I find it very amusing that we are having this conversation on a website that just deployed the slowest suckiest mobile website I have ever seen.
This whole conversation should have been retitled "Why web apps are slow on mobile", and not about JavaScript at all.
The comparison between Objective C and Java is totally ridiculous and beside the point; it's the only thing which ties this article to actual mobile Apps instead of web apps, and fails to address the original article's comments on JavaScript.
In principle, Objective C the language can be used for dynamic binding; in practice, given the Objective C runtime as represented in crt1.o and in the dyld and later dyle dynamic linkers, it can't be. This was an intentional decision by Apple to prevent a dynamic binding override from changing aspects of the UI, and to prevent malicious code from being injected into your program; it is a position Apple strengthened by adding code signing.
Comparison to Java of Objective C as a proxy for comparing iOS applications to Android applications is also ridiculous: there are Android native apps, they are just more difficult to write than the Dalvik Apps. Angry Birds, and most media applications, are not run under the Dalvik VM, but are instead run on the native instruction set.
The whole argument made by the article author is not a rebuttal; it's an attempt to twist the conversation in an attempt to beat on his personal hobby horse.
It says that JavaScript is inherently slow because of the DOM. It says that this should not be applied as a sweeping generalization to all managed languages, e.g. Java. Then it gives examples, including mobile Java performance, where small-heap devices work just fine.
The article mostly agrees with what Drew said, with very few exceptions. The article points at Asha devices (and other devices) that have very small amounts of memory (2 MB) and yet perform really well with a GC (and a slow CPU).
The GC study pointed to in Drew's article was a desktop study taken out of context.
Do we want them to?
Java clients are notoriously not strong in the performance department, period. I mean, this is why my Blu-ray player sucks as a Netflix client: because it's Java-based. Also, no popular "smartphone" created in the last 6 years uses Java as a front end, so the reason why mobile devices improved in performance is that those companies avoided using Java in the first place.
Sure, maybe some throwback clamshell feature phone might run Java and perform well, but you are hardly playing Angry Birds or doing anything more than trivial on a clamshell feature phone. There is a reason why everyone abandoned Java-based phones the moment the iPhone came out.
Also, I'd rather have nobody from Sun/Oracle touch JavaScript, for fear they will lock it down and then sue everyone else because they changed something and suddenly decide they own JavaScript.
The reason why older small embedded devices seemed to work better with Java is because they were using first-generation garbage collectors. Modern compacting collectors use additional space to optimize the speed of collection. But in the grand scheme of things this is a classic time-space tradeoff. The older embedded devices were actually slower running Java overall, but there was no cliff when working set hit a limit of 1/3 or 1/2 of memory, so perceived latency was better overall.
Fact of the matter is, there's no free lunch with a tracing collector. Step back and think about what it actually has to do--walk every object in your entire application, over and over and over and over and over, whether it was strictly necessary or not. And it can't know, of course, because you're not telling it. There are innumerable optimizations that can be made, but they're all either constant-factor (particularly when you consider that a seemingly better-than-constant-factor improvement has only shifted the cost to memory) or heuristic--awesome for some workloads but really bad for others.
Some people might like to drag up Azul's "pauseless" collector, but ask yourself why everybody hasn't adopted their technology? It's because a truly pauseless, concurrent collector requires extremely strong hardware transactional memory, far beyond what even Haswell provides. Azul originally implemented their GC using a customized processor with a customized hardware MMU. That never sold. Their new software "pauseless" collector is actually a virtual machine sitting below the JVM, which virtually implements in software an MMU with strong transactional guarantees. However, while latency is good, their implementation runs considerably slower overall. This is why they rarely, if ever, publish raw performance benchmarks, but only use cases showing how awesome their latency is.
Anyhow, there is no free lunch with GC. Period. If you out-source your garbage collection to a tracing collector, you're invariably leaving a ton of performance on the table, period. Many times, perhaps even most of the time, it's an okay deal. But don't pretend it's not costing anything, and there's no use fantasizing about a GC that will magically make those costs go away. It's algorithmically not possible.
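To make the "walk every object" point concrete, here is a toy sketch in Java of the mark phase of a basic, non-generational tracing collector. `Node` and `ToyMarker` are made-up names for illustration. Real collectors work on raw heap memory and are far more elaborate (generational, incremental, concurrent), so treat this only as a picture of where the work comes from.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Toy stand-in for a heap object: a name plus outgoing references. */
final class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();

    Node(String name) {
        this.name = name;
    }
}

/** Hypothetical mark phase; not how any production VM is implemented. */
final class ToyMarker {
    /** Visits everything reachable from the roots exactly once and returns the marked set. */
    static Set<Node> mark(List<Node> roots) {
        // Identity-based set, so marking depends on object identity, not equals().
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        ArrayDeque<Node> worklist = new ArrayDeque<>(roots);
        while (!worklist.isEmpty()) {
            Node n = worklist.pop();
            // add() returns false for already-marked nodes, which also terminates cycles.
            if (marked.add(n)) {
                worklist.addAll(n.refs);
            }
        }
        return marked;
    }
}
```

Anything not in the returned set is garbage. The cost of each `mark` call grows with the number of reachable objects and references, and a simple collector like this pays it again on every cycle even if almost nothing changed, which is the overhead being described above.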
Would it ease the pain to have the normal, short-lived objects collected by a reference counter, but introduce a collected keyword for long-lived/cyclic objects to be collected by a tracing collector?
Also: How does memory in Rust work? Is it like that? Can someone explain?
I'm making lots of money writing it, and having fun not worrying about memory.
Until you hit the performance / out-of-memory wall. When you have to write a serious application, and one day you will, if you do not mind how many resources your application is using, it will be "hasta la vista" for you.
Thanks. The thing I was unsure about here is whether this is practical in a mobile VM with the added constraint of very limited CPU cache which is the really interesting bit here.
you misspelled frist!
Apps seem to launch instantly since they have hardcoded splash images
What? iOS-Apps are made to lie regarding their startup-performance? That's golden. 10/10.
At delivering malware payloads even in your adbanners. Yes everyone: The brilliant brainiacs that decided scripting documents was "smart to do" just ends up screwing you over in the end. Especially funny since they had the example of macros in documents (like MS Word ones) beforehand and saw how that ended up working out too in malware galore being delivered via those scriptable documents. Why not open up the door and let the trash come blowing in next! Might as well, since the material you consume comes loaded with it because of this idiot's move.
Example cases in point:
Malware More Likely to Come From Legitimate Sites:
http://techtalk.pcpitstop.com/2013/06/26/malware-more-likely-to-come-from-legitimate-sites/?know-notenough=
and
More dangerous to click on an online advertisement than an adult content site these days, Cisco said:
http://www.securityweek.com/easier-get-infected-malware-good-sites-shady-sites-cisco-says
is that you can still write a PhoneGap app that works on damn near anything you want it to in the time it takes the average Java team to tie their shoes, and 95% of the time it's not just going to perform adequately, it's going to perform better than what they tried to do in the native language. We can talk benchmarks until we're blue in the face, but in six years of web dev I've seen a lot of Java code bases and not one was competent. Modern JS JITs kick ass. Modern Java (and I'd argue C# as well) devs at the median level are pattern-swilling, enterprise-blinded, kool-aid snorting jackasses who can't write clean, minimalist, maintainable, good old-fashioned DRY-lovin' code to save their lives.
Sun engineers can talk perf if it makes them feel better but they're missing the point. Decent JS devs do it faster, better, stronger, with smaller teams and robustly across a lot more platforms than equivalent-skilled Java devs could hope to in their wildest dreams. The reason? JS gives you all the rope you need to hang the shit out of yourself. When that happens you have two options. You quit. Or you learn more and you get better. And that's when you start noticing all the useful things you can do with all that rope that languages built with mediocrity-support in mind will never be able to do as efficiently.
And I'll be fair. When performance is really seriously hyper-critically important... you're better off with C (binds very nicely to JS via V8 might I add). Fuck Java.