Slashdot Mirror


User: John.Miecielica

John.Miecielica's activity in the archive.


Comments · 1

  1. Planning Tips on Ask Slashdot: Capacity Planning and Performance Management? · · Score: 1

    I ran the Capacity Management practice for a leading provider of financial data servers for 10 years. We had a dedicated team doing capacity planning for the company. However, a key part of my practice was making the performance data available to the application analysts as well as to the centralized capacity planning team. By fostering this partnership with the application teams, we were able to understand exactly what was going on with each system and develop capacity plans together.

    We settled on the TeamQuest set of products (full disclosure: I now work for TeamQuest as a Product Manager). Many tools out there can help you with some of the initial tasks, like gathering performance metrics, in order to move from Chaotic to Reactive. However, as you move up the maturity curve to Proactive, Service and Value, the available technology really starts to thin out. Some of the key elements you want to pay attention to are:

    - Data Granularity. Some tools only go down to the 15-minute level. TeamQuest can collect data down to 1-second intervals, which some of our customers find essential. For many customers, 1-minute or 5-minute granularity is sufficient. You will also want to view performance data at the process level to find the culprit(s) behind, for example, high CPU utilization.
    - Problem Resolution. You will want an easy-to-use, flexible interface for viewing the performance data at fine granularity to assist in problem determination. Automated correlation analysis is a huge plus here, as the problem you are looking at may actually be a victim of something else going on in your infrastructure.
    - Prediction. Performance of computing systems, unfortunately, is not linear, so simple linear trending will only take you so far. You will need more sophisticated modeling technology to really understand when response time and throughput are going to suffer.
    - Guidance. You need a set of analytic tools that tell you exactly what will cause performance issues and when, and that help you understand what it will take to prevent the problems from occurring in the first place.
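
To make the Prediction point concrete, here is a rough sketch of why linear trending falls short. It uses the textbook M/M/1 queueing formula R = S / (1 - U) as a stand-in model; that choice is purely an assumption for illustration and is not a description of how TeamQuest or any other product models systems. The 10 ms service time and the function names are hypothetical.

```python
def mm1_response_time(service_time, utilization):
    """Mean response time of an M/M/1 queue: R = S / (1 - U)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1.0 - utilization)


def linear_extrapolation(x0, y0, x1, y1, x):
    """Evaluate the straight line through (x0, y0) and (x1, y1) at x."""
    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (x - x0)


SERVICE_TIME = 0.010  # hypothetical: 10 ms of service per request

# Two "observations" at low load, as a simple trending tool might see them.
r30 = mm1_response_time(SERVICE_TIME, 0.30)
r50 = mm1_response_time(SERVICE_TIME, 0.50)

# Compare the straight-line forecast with the queueing model at higher load.
for u in (0.70, 0.90, 0.95):
    trend = linear_extrapolation(0.30, r30, 0.50, r50, u)
    model = mm1_response_time(SERVICE_TIME, u)
    print(f"U={u:.2f}  linear trend={trend * 1000:6.1f} ms  "
          f"M/M/1={model * 1000:6.1f} ms")
```

Under these assumptions, the line fitted at 30% and 50% utilization forecasts roughly 31 ms at 90% utilization, while the queueing formula gives 100 ms. Real systems are messier than a single M/M/1 queue, but the shape of the curve is the reason trending alone tends to underestimate when response time will start to suffer.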