Cache Optimization Now Made Easy, And Pretty
G3ckoG33k writes "Cache optimization has now been made easy, ok, perhaps easier... The guys working with memory management tool Valgrind (see previous story at /.) are now up to version 1.9.5, and it's stable! Even more, there is now also an excellent GUI tool for using Valgrind for serious cache optimization; check out KCachegrind!!!
Besides, who would have thought cache optimization would be not only intellectually but also visually beautiful?"
Shikari is excellent - as well as letting you filter out time spent in the kernel, or monitor a specific thread within a process, it will also give you an instruction-level breakdown of any routines that look suspicious. This is extremely useful when you're trying to understand why a particular routine is slow.
Individual instructions are tinted from blue to yellow based on how expensive they were. You get a cycle count and percentage for each instruction within the routine, indications of where your stalls are, and awesome pop-up tips on suspicious behaviour (e.g., float->int conversions, redundant loads in a loop implying the compiler was being conservative with pointer aliasing, mixing double and float math, and a number of other PowerPC-specific optimisation tricks).
Like any profiler, you have to bear in mind that it may not be telling the whole story, but when you have a routine where you know you need to care about what the compiler is emitting, Shikari is like having a PowerPC assembly guru give you a quick rundown of your code.
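For anyone wondering what the "redundant loads in a loop" tip is getting at, here is a rough sketch in C. The function names are made up for illustration and this is not output from Shikari or any other profiler. In the first version the compiler cannot prove that dst and scale never overlap, so it may reload *scale on every iteration. The other two show common ways to tell it otherwise, assuming the buffers really do not overlap.

```c
#include <stddef.h>

/* Hypothetical example: the compiler must assume a store to dst[i]
   could change *scale, so it may reload *scale on each iteration. */
void scale_aliased(float *dst, const float *src,
                   const float *scale, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * *scale;
}

/* Fix 1: read the value once into a local, so no reload is needed.
   Only equivalent to the version above if dst does not overlap scale. */
void scale_hoisted(float *dst, const float *src,
                   const float *scale, size_t n)
{
    const float s = *scale;
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * s;
}

/* Fix 2: C99 restrict promises the compiler that these pointers do not
   alias. Breaking that promise is undefined behaviour. */
void scale_restrict(float *restrict dst, const float *restrict src,
                    const float *restrict scale, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * *scale;
}
```

All three give the same results when the buffers are distinct. Whether the generated code actually differs depends on your compiler and optimisation level, so check the disassembly rather than taking this on faith.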
Nae bother