Slashdot Mirror


Visualizing System Latency

ChelleChelle writes "Latency has a direct impact on performance — thus, in order to identify performance issues it is absolutely essential to understand latency. With the introduction of DTrace it is now possible to measure latency at arbitrary points; the problem, however, is how to visually present this data in an effective manner. Toward this end, heat maps can be a powerful tool. When I/O latency is presented as a visual heat map, some intriguing and beautiful patterns can emerge. These patterns provide insight into how a system is actually performing and what kinds of latency end-user applications experience."

68 comments

  1. another solution to an already solved problem... by Michael+Kristopeit · · Score: 1, Insightful

    mapping latency in a system using colored maps representing throughput has been a tool of db and network sysadmins for many many MANY years.

  2. wow, standards have fallen? by Anonymous Coward · · Score: 0

    I hope beutiful isn't commonplace...
    New management?

    1. Re:wow, standards have fallen? by Anonymous Coward · · Score: 0

      New management?

      What? No. You must be new here.

  3. Re:another solution to an already solved problem.. by Anonymous Coward · · Score: 1, Informative

    How bout you RTFA before you make you're smartass comments, since yours is almost a direct fucking quote from it. However, this isn't about measuring network latency, it's about disk latency, something that until recently was extraordinarily hard to measure.

  4. So shouting is bad by 0racle · · Score: 1

    I guess shouting at systems to make them start working has the opposite effect. Who knew a server was so emotional.

    --
    "I use a Mac because I'm just better than you are."
    1. Re:So shouting is bad by gazuga · · Score: 1

      Correct - see here

      --
      "I turn away with fright and horror from the lamentable evil of functions which do not have derivatives."
  5. Re:another solution to an already solved problem.. by Jorl17 · · Score: 1

    You're = your in this case. Other than that, nice comment.

    --
    Have you heard about SoylentNews?
  6. Relevance to Latency by headkase · · Score: 0, Offtopic

    Sorry, should have mentioned why this comment should be in this story. The flip side of measuring your system's latency to improve performance is to optimize your program as well as the system. Pyke is a meta-programming framework and in effect caches your program's structure and variables, dramatically improving performance. Which is latency.

    --
    Shh.
  7. Re:another solution to an already solved problem.. by Michael+Kristopeit · · Score: 0

    how bout i did read the fucking article? how about i never said it was about network latency? how about measuring disk latency is no different than measuring ANY latency? how about db latency using optimized table handlers ends up becoming exactly a problem of disk latency?

  8. The sky is falling... by PPalmgren · · Score: 4, Funny

    Informative article, all on one page, not chock full of ads. Now excuse me while I stock my bunker.

    1. Re:The sky is falling... by aicrules · · Score: 2, Informative

      All truly informative articles follow this paradigm. You only need the multi-page, multi-ad to pay for content that very few people will read because it's not that informative or interesting.

    2. Re:The sky is falling... by Anonymous Coward · · Score: 0

      That's because it's a direct link to a professional organization that is hosting the primary research ... not a journalistic website. Professional orgs make most of their money from member dues and donations, and as such have no need for ads.

      The fact that it also happens to be an incredibly good article helps as well, obviously.

  9. Re:Here's the secret... by Anonymous Coward · · Score: 0

    Do you have any advice on how i can improve the taste of the santorum i felch from my boyfriend's ass after i jizz in it?

  10. Re:another solution to an already solved problem.. by Anonymous Coward · · Score: 0

    However, this isn't about measuring network latency, it's about disk latency, something that until recently was extraordinarily hard to measure.

    Not on my planet. Those who forget history are doomed to repeat it. I've been making my living doing mostly performance work for fifteen years, including the production of countless graphs, animations, 3d visualizations, and other visual (and sonic) aids of metrics such as disk latency. And I'm not talking mrtg, and other crude views. Measuring the past is one thing, but predicting the future is much more interesting.

    Disk latency is just a tiny part of a much larger picture. The author of TFA is doing good work, and I fully support his writing about it, developing tools, etc. But it is nothing new and should not be represented as such. Except the bit about yelling at disk drives. That's very cool. That dude needs to be recognized for his uncanny ability to yell, and intimidate hardware. I would support funding further research into whether extreme profanity is of any additional benefit. That's an open source project I would get involved in!

  11. old school visualization by bzdang · · Score: 5, Interesting

    Back in the day, working at an instrumentation company as a mechanical guy, I stopped to watch the senior electronic design engineer who was doing something that looked interesting. He had an old persistence-type storage oscilloscope hooked up to the rack-mount computer for a new instrument system and was watching the scope display, which was producing some fascinating patterns. Knowing f'all about this stuff but intrigued, I asked him to explain what was happening. He explained (and I'll butcher the explanation with layman's terms) that he was using d/a converters on the high and low bytes of the program address? to drive the x and y axes of the scope, and watching to see where, in the software, the processor was spending much of its time. He pointed to a hot spot on the scope display and said that this was where he would concentrate on optimizing his code. Fwiw, I thought that was pretty cool.
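
    The scope trick described above can be imitated in software. What follows is only a rough sketch under stated assumptions: the addresses are 16 bits wide, the high byte drives x and the low byte drives y, and the trace is made up (`simulated_trace` and its 0x4200 loop are hypothetical, not taken from the original system).

```python
import random
from collections import Counter

def split_address(addr):
    """Split a 16-bit address into (high byte, low byte), as two 8-bit DACs would see it."""
    return (addr >> 8) & 0xFF, addr & 0xFF

def sample_hotspots(addresses):
    """Count how often each (x, y) point on the simulated scope is hit."""
    return Counter(split_address(a) for a in addresses)

def simulated_trace(n, seed=0):
    """Hypothetical trace: about 80% of samples fall in a tight loop at 0x4200-0x420F."""
    rng = random.Random(seed)
    trace = []
    for _ in range(n):
        if rng.random() < 0.8:
            trace.append(0x4200 + rng.randrange(16))
        else:
            trace.append(rng.randrange(0x10000))
    return trace

hits = sample_hotspots(simulated_trace(10_000))

# Collapse the grid onto the x axis to find the brightest column on the "scope".
hottest_rows = Counter()
for (high, _low), count in hits.items():
    hottest_rows[high] += count
row, count = hottest_rows.most_common(1)[0]
print(f"hottest high byte: 0x{row:02X} ({count} of 10000 samples)")
```

    The bright spot on the phosphor corresponds to the most frequently hit cell in `hits`; on real hardware the persistence of the screen did the counting.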

    1. Re:old school visualization by harrkev · · Score: 1

      This must have been a long time ago, back when you had easy access to the address lines.

      Now, that same job would be VERY difficult! Most data accesses occur to data in the cache, which is not brought out to pins outside of the processor. And when memory accesses do happen, they happen over dedicated DDR address lines, which are very high speed (hard to probe), and the address lines are used to access both rows and columns, so some external circuitry is needed in order to determine what the real address is.

      Cool idea though, but pretty hard to pull off these days without a serious engineering effort.

      --
      "-1 Troll" is the apparently the same as "-1 I disagree with you."
    2. Re:old school visualization by snowboardin159 · · Score: 0

      sounds like a good strategy, and much like what TFA is doing. I'd be interested to see something like this IRL, or streaming, or in graphs that were readable.

    3. Re:old school visualization by pipatron · · Score: 0, Troll

      OMG! Thanks for telling us this, I bet no one that knows what a computer program is knew this!

      --
      c++; /* this makes c bigger but returns the old value */
    4. Re:old school visualization by Anonymous Coward · · Score: 0

      Old, old, old school.
      But very cool.

      I once put the stack pointer to an area of memory mapped to the video buffer.

    5. Re:old school visualization by tuomoks · · Score: 1

      Yep, probably a very long time ago. Not that it was easy even on mainframes, but it was very useful, with no overhead to measure. As you may know - a mainframe is happy when 101% busy - measurement overhead is very often a bad thing! It was fun, really, but reading the results wasn't always easy - is anything? Later, in the 80's / 90's, when simulating, estimating, measuring, etc. file / disk / network systems, the heat maps created with our hardware people on channel, controller, disk, cache, DMA, etc. timings / sizes / rates were indispensable - accurate to within a microsecond (nanoseconds for memory) / no overhead / all the measurement points you can dream of. This was common in all large software / system development - I wonder how it is done today, and how many could use these devices / read the results correctly? Sometimes I just wonder - have we lost some skills over the years? Actually, the same heat maps can be used in networks very well - very useful! In wireless (and wireline, but wireless usually has many more variations) networks you can see the latency, other problems and the performance easily with one look at a nice "heat map", maybe displayed on a 42" screen today.

  12. Re:pretty graphs by ushering05401 · · Score: 5, Insightful

    These visualizations are used to condense the information gathered on one second intervals from running systems. Any graph of substantially advanced material is going to require explanation until you understand what is being measured, how it is being graphed, and how this information translates in real world performance.

    Of course a casual reader from the net needs to read text to understand what is going on. These aren't sales figure pie-charts and shouldn't necessarily be accessible for uninformed parties.

    On another note.. Do you think casual readers would have any more success interpreting the raw data files? Anyhow, I am interested in the technique as it is not one I am currently using. With a little practice this may be a good at a glance technique.

  13. Re:pretty graphs by Anonymous Coward · · Score: 1, Insightful

    The article presented plenty of information related to its topic. The topic was that using a heat map to describe latency is more useful than simple averages and maximums displayed as line graphs. The article then analyzed certain interesting cases where a heat map had information that would not have existed in a line graph. What you are griping about is that the topic itself is simple and that the article is full of individual analyses that provide support for the topic.
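
    To make the heat map versus line graph point concrete, here is a minimal sketch of how latency samples could be bucketed into a grid. It is not the article's tooling: the function names, bucket sizes, and sample values are all hypothetical, chosen so that one second contains two distinct latency bands.

```python
from collections import Counter

def latency_heat_map(samples, time_step_s=1.0, latency_step_ms=1.0):
    """Bucket (timestamp_s, latency_ms) samples into a {(time_col, latency_row): count} grid."""
    grid = Counter()
    for timestamp_s, latency_ms in samples:
        col = int(timestamp_s // time_step_s)
        row = int(latency_ms // latency_step_ms)
        grid[(col, row)] += 1
    return grid

def render(grid, shades=" .:*#"):
    """Draw the grid as text: time runs left to right, latency bottom to top."""
    cols = max(c for c, _ in grid) + 1
    rows = max(r for _, r in grid) + 1
    peak = max(grid.values())
    lines = []
    for row in reversed(range(rows)):
        line = ""
        for col in range(cols):
            count = grid.get((col, row), 0)
            # Empty cells stay blank; nonzero counts scale from the second shade up to the darkest.
            idx = 0 if count == 0 else 1 + (count - 1) * (len(shades) - 2) // max(peak - 1, 1)
            line += shades[idx]
        lines.append(line)
    return "\n".join(lines)

# Hypothetical samples: a fast band near 1 ms and a slow band near 9 ms.
samples = [(0.1, 1.2), (0.2, 1.4), (0.3, 1.1), (0.5, 9.3), (0.7, 9.8),
           (1.1, 1.3), (1.4, 1.6), (1.6, 9.1)]
grid = latency_heat_map(samples)
print(render(grid))
```

    The average latency in the first second of this made-up data is about 4.6 ms, a value at which no I/O actually completed. A line graph of averages would plot exactly that point, while the grid keeps the two bands separate.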

  14. Game Theory and other modeling by masterwit · · Score: 1

    After reading the article, this idea of a "heat map" or frequency distribution mapping (of sorts) can (sort of) be summed in:

    A particular advantage of heat-map visualization is the ability to see outliers.

    I find this particularly interesting as this graphically now allows a way to "filter" the real outlier out from a sea of data. Also,

    Instead of a random distribution, latency is grouped together at various levels that rise and fall over time, producing lines in a pattern that became known as the icy lake. This was unexpected, especially considering the simplicity of the workload.

    And concluding the section on what they dub as the "icy lake"...

    To summarize what we know about the icy lake: lines come from single disks, and disk pairs cause increasing and decreasing latency. The actual reason for the latency difference over time that seeds this pattern has not been pinpointed; what causes the rate of increase/decrease to change (change in slope seen in figure 5) is also unknown; and, the higher latency line seen in the single-disk pool (figure 4) is also not yet understood. Visualizing latency in this way clearly poses more questions than it provides answers.

    Without actually seeing the data or knowing the specifics of latency, from a pure mathematical standpoint I wonder what would result if one treated the set of numbers (from each disk) as a random sequence, identifying outliers (as they did using this heat model)...then graphically mapping those using a "chaos game theory algorithm". By using a graph to statistically analyze/visualize the "outliers", perhaps more could be revealed on the "randomness" of how one disk or a pair of disks reacts relating to the whole system.

    I do not claim to know much in this area at all and this is merely speculation on how the set of numbers "randomness" may be approached...
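
    For anyone who wants to try the speculation above, here is one possible reading of it as a sketch: the classic "chaos game", where each value in a sequence picks a triangle vertex and the point moves halfway toward it. Reducing a latency value to a vertex with a modulo is purely an assumption made for illustration, as is the function name.

```python
def chaos_game_points(values, vertices=((0.0, 0.0), (1.0, 0.0), (0.5, 1.0))):
    """Map a sequence of integers to 2D points by repeatedly moving halfway toward a chosen vertex."""
    x, y = 0.5, 0.5  # arbitrary starting point inside the triangle
    points = []
    for value in values:
        # Assumption: the value (e.g. latency in whole microseconds) selects a vertex by modulo.
        vx, vy = vertices[value % len(vertices)]
        x, y = (x + vx) / 2, (y + vy) / 2
        points.append((x, y))
    return points

# A perfectly repetitive sequence collapses toward a single vertex.
points = chaos_game_points([0] * 40)
print(points[-1])
```

    A sequence with no structure tends to fill out the familiar Sierpinski pattern, while repetition or correlation shows up as clumps or missing regions. Whether real disk latencies would show anything interesting this way is an open question.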

    --
    We should start a new Slashdot and return control to the geeks. It actually wouldn't be that hard to get some users to
  15. Sun by Anonymous Coward · · Score: 0

    Of course nobody credits Sun for DTrace in the article...ugh

  16. Re:pretty graphs by Gothmolly · · Score: 1

    ACM is a scholarly, research-oriented group. If you're looking for spoon-fed, PCMagazine types of charts and graphs, look elsewhere. Lots of text generally means that someone with brains has to interpret the data, because the interpretation is non-trivial.

    --
    I want to delete my account but Slashdot doesn't allow it.
  17. Chill man. by Anonymous Coward · · Score: 0

    Woah, Michael. It's OK. It's just slashdot.

    Just kick back with a couple of these Ancient-New-World-Beers and chill out man...

    Know why you got that 0 next to your name? I could mod you down to oblivion but I think my beer comment is funnier. What's better is since I'm AC I still can.

    1. Re:Chill man. by Michael+Kristopeit · · Score: 0

      woah, yourself. you are NOTHING.

    2. Re:Chill man. by Anonymous Coward · · Score: 0

      Anger issues anyone?

    3. Re:Chill man. by Michael+Kristopeit · · Score: 0
      ignorancy issues, anyone?

      first you replace "your" with "you're".... it takes a special kind of moron to do that.... then you claim measuring disk latency is something that is in some way more difficult or different than measuring any other kind of latency.

      you are doing a service to yourself posting as AC to hide your identity... until that changes, you are NOTHING.

    4. Re:Chill man. by Anonymous Coward · · Score: 0

      I'm not pretending to be anybody else. I'm not the original AC that disagreed with you, and honestly I know nothing about latency, but nice attitude. I'm sure it serves you well in the real world...

    5. Re:Chill man. by Anonymous Coward · · Score: 0

      ignorancy issues, anyone?

      HAHA ignorancy. I missed that first read.

    6. Re:Chill man. by Anonymous Coward · · Score: 0

      ur mums face serves me well in the real world.

      disagreeing with someone that claims i'm wrong with statements undeniably false absolutely serves me well... EVERYWHERE.

    7. Re:Chill man. by Anonymous Coward · · Score: 0

      did you miss it because you're lazy or because you're ignorant?

  18. Re:Here's the secret... by X0563511 · · Score: 1

    Or female. What's your point?

    --
    For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
  19. DTrace is amazing, but... by lostsoulz · · Score: 1

    ...it's a shame that instrumentation of things such as EMC's PowerPath is a little painful. I guess there will always be gaps where vendor meets vendor and closed source meets open source, but it remains rather complex to analyse what's happening in Solaris with PowerPath and some Storage Foundation stirred in for good measure. Impossible? No...but maybe we'd all benefit from a little more interoperability?

    It's a great article though - Brendan's DTrace authority is impressive.

  20. Re:another solution to an already solved problem.. by Colin+Smith · · Score: 1

    something that until recently was extraordinarily hard to measure.

    Really?

    Device:  rrqm/s  wrqm/s     r/s   w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
    hda        0.00    0.00  114.85  0.00   0.45   0.00      8.00      0.73   6.28   6.34  72.87

    Where await and svctm are average wait (milliseconds) for the disk & queue and service time for the disk.

    Or do you mean something else?
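
    For reference, the two figures mentioned can be pulled out of that output with a few lines of code. This is only a sketch that pairs the header with the row quoted above; `parse_iostat_row` is a hypothetical helper, not part of any tool.

```python
HEADER = "Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util"
ROW = "hda 0.00 0.00 114.85 0.00 0.45 0.00 8.00 0.73 6.28 6.34 72.87"

def parse_iostat_row(header, row):
    """Pair each column name with its value; the first column is the device name."""
    names = header.split()[1:]
    device, *values = row.split()
    return device, dict(zip(names, map(float, values)))

device, stats = parse_iostat_row(HEADER, ROW)
print(device, "await:", stats["await"], "ms, svctm:", stats["svctm"], "ms")
```

    Each of these is one number per device per reporting interval, i.e. an average over every I/O in that interval.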

     

    --
    Deleted
  21. Sure, but it'd work quite well by Anonymous Coward · · Score: 0

    in a simple microcontroller setup

    1. Re:Sure, but it'd work quite well by Anonymous Coward · · Score: 0

      You have not tried to snoop the address bus of the program memory on a simple microcontroller in the recent 20 years have you? Most microcontrollers today are basically "computers on a chip", internal flash to run program from, internal sram for data, internal eeprom for storing settings, internal peripherals for output.

    2. Re:Sure, but it'd work quite well by jnork · · Score: 1

      As a compromise, consider capturing the return address in a timer interrupt and stuffing into a DAC or two. Of course that requires that you have an unused DAC (or two) on board, a timer, and processing time available in the interrupt, and the result probably won't be as smooth as the DAC directly on the bus. Still, if you can do it it's better than nothing.

      If you can't, perhaps you can use a different peripheral and some external logic. SPI to a shift register, perhaps? Or I can see having a second processor and sending it serially, then the second processor outputs the DAC.

      Anyway, I can see using variations of this technique on some of my projects. Pretty cool idea, really.

      --
      Cleverly disguised as a responsible adult.
  22. At Queue, the sky is always falling... by davecb · · Score: 1

    I really like ACM Queue, which regularly prints articles for practitioners about things which both we and our more academic colleagues care about.

    I recommend it, and on rare occasions, contribute.

    --dave

    --
    davecb@spamcop.net
  23. Effective? by Anonymous Coward · · Score: 0

    Summary of the article: The graphs are pretty cool, but we do not understand what they tell us.

    That's not quite what I would call effective...

  24. Re:another solution to an already solved problem.. by forkazoo · · Score: 2, Insightful

    Really?

    Device:  rrqm/s  wrqm/s     r/s   w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
    hda        0.00    0.00  114.85  0.00   0.45   0.00      8.00      0.73   6.28   6.34  72.87

    Where await and svctm are average wait (milliseconds) for the disk & queue and service time for the disk.

    Or do you mean something else?

    The data presented in the article are actually quite a bit more subtle and interesting than the summary data you've got there. It would probably be impossible to notice the effects of the "icy lake" phenomenon they describe with average summary data like that, or to appreciate the effect of shouting. (Most I/Os happen relatively quickly during the shouting, so the average doesn't skew up very high. What's remarkable about the shouting is the sudden burst of outliers indicating a few accesses with terrible performance.)
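
    A small sketch of why an average can hide a burst like that. The numbers are invented for illustration (they are not measurements from the article), and the 100 ms outlier threshold is an arbitrary assumption.

```python
from statistics import mean

def summarize(latencies_ms, outlier_threshold_ms=100.0):
    """Return the mean, the maximum, and the number of samples above the threshold."""
    return {
        "mean_ms": mean(latencies_ms),
        "max_ms": max(latencies_ms),
        "outliers": sum(1 for v in latencies_ms if v > outlier_threshold_ms),
    }

# Hypothetical intervals: 1000 I/Os each, five of which stall badly in the second one.
quiet = [5.0] * 1000
shouting = [5.0] * 995 + [500.0] * 5

print(summarize(quiet))
print(summarize(shouting))
```

    In this made-up case the mean only moves from 5 ms to about 7.5 ms, which would be easy to overlook on a line graph, while the five 500 ms stalls would stand out as isolated pixels high up on a heat map.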

  25. easy. by jd2112 · · Score: 2, Funny

    Take, for example, AT&T Network performance:
    Current: Snail
    Expected, after customers leave in droves over data plan changes: Snail on meth (see yesterday's /. article)
    Expected, once AT&T upgrades equipment: Sloth on Valium

    --
    Any insufficiently advanced magic is indistinguishable from technology.
  26. Heat? by scottwilkins · · Score: 1, Funny

    Heat kinda makes me slow too...

  27. Re:pretty graphs by gumbi+west · · Score: 1

    Do you actually think the concept of a heat map is new?

    A great graph has (1) a title, (2) labeled x- and y-axes, and (3), for a 3D figure, labels for intensity and the z-axis as well. All text should be readable or removed. Generally, difficult-to-interpret figures should have a paragraph below them explaining (a) what they show and--ideally--(b) what the researcher concludes from the graph when this is not obvious.

  28. Re:pretty graphs by gumbi+west · · Score: 1

    But almost all their concluding remarks on a figure are, "we don't understand this graph."

  29. Re:another solution to an already solved problem.. by Xeleema · · Score: 1

    Crimey! Slashdot needs a "Hide all comments from UIDs >1M"

    --
    "When I am king, you will be first against the wall..."
  30. Re:another solution to an already solved problem.. by Xeleema · · Score: 1

    the summary data you've got there

    Funny. I recall the command syntax for that one lets you set up intervals per second. That would be the "black foot" that gets you out of the "icy lake" phenomenon you describe.

    --
    "When I am king, you will be first against the wall..."
  31. Re:another solution to an already solved problem.. by Michael+Kristopeit · · Score: 0

    i agree that your children will probably be idiots not worth listening to...

  32. Re:another solution to an already solved problem.. by azrider · · Score: 1

    How about "Hide all comments where $UID > $MINE " ?

    --
    And ye shall know the truth, and the truth shall make you free.
    John 8:32(King James Version)
  33. Re:pretty graphs by azmodean+1 · · Score: 2, Insightful

    That's the point, a good engineer's (or scientist's) response to new data that they can't fully explain is generally unmitigated glee, it means they've found something new. My takeaway from the article is, "try this new technique/tool, you'll see new data".

    On another note, I've done some very basic analysis of disk performance at work, and this approach would have allowed me to be much more confident in my results. As it was, basically all I could do when comparing disks and filesystems was use iozone to characterize the "knee points" the article keeps mentioning, and try to map changes in aggregate numbers to saturation of various interfaces and/or devices. This method for actually getting sampling data for latency, and potentially from real workloads even, would have been extremely helpful.

  34. Re:another solution to an already solved problem.. by Anonymous Coward · · Score: 0

    You're an interesting man, Michael. How's Rachel ? And the dogs ? I do hope they don't suffer from your anger issues. You might want to consider not putting quite as much about yourself on the internet, if you're gonna be doing the rabies thing at people; although I'd suggest a better alternative would be to get rabies shots.

  35. Re:pretty graphs by gumbi+west · · Score: 1

    uh, "a good engineer's (or scientist's) response to new data that they can't fully explain is generally unmitigated glee, it means they've found something new." perhaps, but generally you don't ask others to read about it until you understand something about the phenomenon OR it has withstood several attempts to understand it.

    You also wrote, "new technique" but what is new? Do you think they invented the heat map, or exploratory data analysis?

  36. Re:another solution to an already solved problem.. by Michael+Kristopeit · · Score: 0
    i made a statement of fact. you claimed i didn't read the article. you called me a smartass. you made claims about measuring disk latency that could not be more wrong. but most of all you said i was wrong.

    then i call you out on it USING THE SAME PHRASING YOU USED, and then you attack that... YOU WERE WRONG. you are probably wrong more than you are right, which has helped you develop this accusatory defense mechanism. i do have a wife. i do have dogs. you might want to consider putting a little more about yourself on the internet... perhaps then someone could find you and educate you, monkey.

    you are NOTHING.

  37. Re:another solution to an already solved problem.. by Anonymous Coward · · Score: 0

    Hmm. It appears you have a hard time grasping the concept of "Anonymous Coward".

    "Anonymous Coward", dear smartest pencil, isn't an actual person. It's just another name for an anonymous poster. I'm not the Anonymous Coward you first exploded at, nor are either of us the one you exploded at second, and so on.

    I happen to be the one you just replied to, but that's just because you amuse me and I'm actually bothering to track that post for a few days.

    A lot of us may be NOTHING, but I guess that makes us mostly the same as what you've got behind the eyes, monkey. Have an education that doesn't involve whether single quotes are more efficient than double quotes in PHP.

  38. Re:another solution to an already solved problem.. by Anonymous Coward · · Score: 0

    so your point is, anonymous posters should be ignored... and yet you yourself still post anonymously.

    you can't buy that kind of stupid.

  39. Re:pretty graphs by jgrahn · · Score: 1

    On another note.. Do you think casual readers would have any more success interpreting the raw data files? Anyhow, I am interested in the technique as it is not one I am currently using. With a little practice this may be a good at a glance technique.

    Yes -- I do things like these all the time, and I frequently have reason to go: "Oh, our systems behave like *this* in reality?" I wrote an article on the subject (with gnuplot as the visualization tool): http://snipabacken.se/~grahn/gnuplot_kicks_ass/

  40. Re:another solution to an already solved problem.. by Macka · · Score: 1

    That wouldn't leave me with very many people to talk to ;)

  41. Re:another solution to an already solved problem.. by Anonymous Coward · · Score: 0

    Neh. My point was that Mr Kristopeit apparently didn't grasp that there was a difference between us Anonymous Cowards, and I suspect that an overdose of AC comments at him might be the cause of his outbursts.

    He's stopped replying, too, so I guess he got it. Well, either that or he had a heart attack from all the excitement.

  42. Re:another solution to an already solved problem.. by Michael+Kristopeit · · Score: 1
    how about you read my comment IN THIS THREAD, POSTED BEFORE YOURS that includes me clearly explaining the concept of Anonymous Cowards, and proving my understanding.

    then you sweep in like a donkey with claims that that didn't just happen...

    you are the worst kind of stupid.

    if i respond to an AC, and an AC responds back, i will assume it was the original AC until told otherwise. the burden of proof is not on my shoulders... i've already proven i'm the same person. you can't prove you aren't the same AC.

    you are NOTHING.

  43. Re:another solution to an already solved problem.. by Anonymous Coward · · Score: 0

    And here I was, about to delete my bookmark to this. You truly are an endless source of amusement.

    There's thousands, potentially millions of ACs, and you assume that the same guy that launches a one-off comment will keep coming back to every comment he makes to see if you've answered? Dude, the only reason I've bookmarked this is that your incredibly arrogant attitude provides me a measure of amusement. I really couldn't be bothered to bookmark and check every anonymous comment I make.

    You appear to severely overestimate your own importance. Carry on entertaining me, minion.

  44. Re:another solution to an already solved problem.. by Michael+Kristopeit · · Score: 1
    i have never once said i was important.

    you are NOTHING.

  45. Re:another solution to an already solved problem.. by vegiVamp · · Score: 1

    One small sentence and your key signature ? How disappointing. I guess your batteries are running down, by now - too much screaming will make you hoarse.

    Shame, I'll have to find another source of amusement now.

    --
    What a depressingly stupid machine.
  46. Re:another solution to an already solved problem.. by Anonymous Coward · · Score: 0

    ur mums face works for me