Slashdot Mirror


Visualizing System Latency

ChelleChelle writes "Latency has a direct impact on performance — thus, in order to identify performance issues it is absolutely essential to understand latency. With the introduction of DTrace it is now possible to measure latency at arbitrary points; the problem, however, is how to visually present this data in an effective manner. Toward this end, heat maps can be a powerful tool. When I/O latency is presented as a visual heat map, some intriguing and beautiful patterns can emerge. These patterns provide insight into how a system is actually performing and what kinds of latency end-user applications experience."

7 of 68 comments

  1. The sky is falling... by PPalmgren · · Score: 4, Funny

    Informative article, all on one page, not chock full of ads. Now excuse me while I stock my bunker.

    1. Re:The sky is falling... by aicrules · · Score: 2, Informative

      All truly informative articles follow this paradigm. You only need the multi-page, multi-ad format to pay for content that very few people will read because it's not that informative or interesting.

  2. old school visualization by bzdang · · Score: 5, Interesting

    Back in the day, working at an instrumentation company as a mechanical guy, I stopped to watch the senior electronic design engineer, who was doing something that looked interesting. He had an old persistence-type storage oscilloscope hooked up to the rack-mount computer for a new instrument system and was watching the scope display, which was producing some fascinating patterns. Knowing f'all about this stuff but intrigued, I asked him to explain what was happening. He explained (and I'll butcher the explanation with layman's terms) that he was using D/A converters on the high and low bytes of the program address? to drive the x and y axes of the scope, and watching to see where in the software the processor was spending much of its time. He pointed to a hot spot on the scope display and said that this was where he would concentrate on optimizing his code. Fwiw, I thought that was pretty cool.
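The scope trick described above amounts to a two-dimensional histogram of sampled program addresses. A minimal sketch of the same idea in software, assuming 16-bit addresses and assuming the high byte drives x and the low byte drives y (the comment does not say which is which); `hot_spots` and the sample data are hypothetical:

```python
from collections import Counter

def hot_spots(addresses, top=3):
    """Bin sampled 16-bit program addresses by (high byte, low byte).

    Returns the `top` most frequently hit (x, y) points with their hit
    counts, which is roughly what the brightest dots on the scope show.
    """
    grid = Counter()
    for addr in addresses:
        x = (addr >> 8) & 0xFF  # high byte: assumed to drive the x axis
        y = addr & 0xFF         # low byte: assumed to drive the y axis
        grid[(x, y)] += 1
    return grid.most_common(top)

# Made-up samples: most of the time is spent at address 0x1234.
samples = [0x1234] * 50 + [0x1235] * 5 + [0x8000] * 10 + [0x00FF]
for (x, y), hits in hot_spots(samples):
    print(f"x={x:#04x} y={y:#04x} hits={hits}")
```

The most frequently hit point is where the processor spends most of its sampled time, so that is where optimization effort would go first.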

  3. Re:pretty graphs by ushering05401 · · Score: 5, Insightful

    These visualizations are used to condense the information gathered at one-second intervals from running systems. Any graph of sufficiently advanced material is going to require explanation until you understand what is being measured, how it is being graphed, and how this information translates into real-world performance.

    Of course a casual reader from the net needs to read text to understand what is going on. These aren't sales figure pie-charts and shouldn't necessarily be accessible for uninformed parties.

    On another note: do you think casual readers would have any more success interpreting the raw data files? Anyhow, I am interested in the technique as it is not one I am currently using. With a little practice this may be a good at-a-glance technique.
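To make the "condensing" step concrete, here is a rough sketch of how raw latency events could be reduced to a heat map: one column per second, one row per latency range, and a shade for the number of I/Os that landed in each cell. This is not the article's implementation; the function names, the 1000 µs bucket size, and the ASCII shading are all assumptions for illustration.

```python
from collections import Counter

def latency_heatmap(events, bucket_us=1000, n_buckets=10):
    """Condense (timestamp_s, latency_us) events into per-second columns.

    Returns {second: [count per latency bucket]}; latencies beyond the
    last bucket are clamped into it.
    """
    counts = Counter()
    for ts, lat in events:
        col = int(ts)                                     # one column per second
        row = min(int(lat // bucket_us), n_buckets - 1)   # latency range
        counts[(col, row)] += 1
    seconds = sorted({c for c, _ in counts})
    return {s: [counts[(s, r)] for r in range(n_buckets)] for s in seconds}

def render(heatmap, shades=" .:*#"):
    """Draw the map as text: time runs left to right, latency bottom to top."""
    if not heatmap:
        return ""
    peak = max(max(col) for col in heatmap.values())
    n_rows = len(next(iter(heatmap.values())))
    lines = []
    for r in reversed(range(n_rows)):
        line = ""
        for s in sorted(heatmap):
            c = heatmap[s][r]
            # Empty cells stay blank; busier cells get darker shades.
            idx = 0 if c == 0 else 1 + (c * (len(shades) - 2)) // peak
            line += shades[idx]
        lines.append(line)
    return "\n".join(lines)

# Made-up events: (timestamp in seconds, latency in microseconds).
events = [(0.1, 500), (0.2, 700), (0.9, 9500), (1.5, 1500), (1.6, 25000)]
print(render(latency_heatmap(events)))
```

Unlike a per-second average, each column keeps the full distribution, so a handful of slow I/Os shows up as faint cells near the top rather than disappearing into the mean.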

  4. Re:another solution to an already solved problem.. by forkazoo · · Score: 2, Insightful

    Really?

    Device:  rrqm/s  wrqm/s     r/s   w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
    hda        0.00    0.00  114.85  0.00   0.45   0.00      8.00      0.73   6.28   6.34  72.87

    Where await and svctm are average wait (milliseconds) for the disk & queue and service time for the disk.

    Or do you mean something else?

    The data presented in the article are actually quite a bit more subtle and interesting than the summary data you've got there. It would probably be impossible to notice the effects of the "icy lake" phenomenon they describe with average summary data like that, or to appreciate the effect of shouting. (Most I/Os happen relatively quickly during the shouting, so the average doesn't skew up very high. What's remarkable about the shouting is the sudden burst of outliers indicating a few accesses with terrible performance.)
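The point about averages hiding outliers is easy to check with a few lines of arithmetic. The numbers below are invented for illustration, not taken from the article: 990 I/Os at 1 ms plus 10 that stall for 500 ms.

```python
import statistics

def summarize(latencies_ms):
    """Return the mean, median and worst case of a list of latencies."""
    return {
        "mean": statistics.mean(latencies_ms),
        "median": statistics.median(latencies_ms),
        "max": max(latencies_ms),
    }

# Hypothetical workload: 1% of I/Os are terrible, the rest are fast.
latencies = [1.0] * 990 + [500.0] * 10
summary = summarize(latencies)
print(summary)
```

The mean comes out at about 6 ms, which looks unremarkable next to an `await` figure like the one quoted above, while the 500 ms worst case only appears if you look at the distribution itself.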

  5. easy. by jd2112 · · Score: 2, Funny

    Take, for example, AT&T Network performance:
    Current: Snail
    Expected, after customers leave in droves over data plan changes: Snail on meth (see yesterday's /. article)
    Expected, once AT&T upgrades equipment: Sloth on Valium

    --
    Any insufficiently advanced magic is indistinguishable from technology.

  6. Re:pretty graphs by azmodean+1 · · Score: 2, Insightful

    That's the point: a good engineer's (or scientist's) response to new data that they can't fully explain is generally unmitigated glee, because it means they've found something new. My takeaway from the article is, "try this new technique/tool, you'll see new data".

    On another note, I've done some very basic analysis of disk performance at work, and this approach would have allowed me to be much more confident in my results. As it was, basically all I could do when comparing disks and filesystems was use iozone to characterize the "knee points" the article keeps mentioning, and try to map changes in aggregate numbers to saturation of various interfaces and/or devices. This method for actually getting sampling data for latency, potentially even from real workloads, would have been extremely helpful.