{"id":1302,"date":"2017-05-02T22:29:35","date_gmt":"2017-05-02T22:29:35","guid":{"rendered":"https:\/\/w2.cleardb.net\/?p=1302"},"modified":"2022-09-22T14:41:42","modified_gmt":"2022-09-22T14:41:42","slug":"best-practices-for-monitoring-and-measuring-data-center-performance","status":"publish","type":"post","link":"https:\/\/www.navisite.com\/blog\/best-practices-for-monitoring-and-measuring-data-center-performance\/","title":{"rendered":"Best Practices for Monitoring and Measuring Data Center Performance"},"content":{"rendered":"
In our digitally driven world, IT professionals are acutely aware of just how closely data center infrastructure performance is tied to business performance. Technology consumers \u2013 employees, suppliers, customers, and prospects<\/em> \u2013 expect highly available, fast, and responsive interactions from every system they touch. As a result, IT professionals play critical roles in empowering the strategic success and tactical effectiveness of many businesses today. Accordingly, it is vital for IT to know which hardware and software metrics to monitor, and to understand how these metrics relate to one another. This enables IT to continuously optimize the infrastructure that empowers the business to achieve its goals and objectives. The challenge in monitoring these metrics is that the performance of multiple infrastructure components is interrelated. Network capacity and speed, the number and power of CPU cores, the efficiency of application code, the level of contention for shared computing resources, and the various configurations of hypervisors, databases, and other computing services can all affect performance. As a result, focusing on just one layer of the data center infrastructure stack, without considering the multi-dimensional impact it has on the others, can negate the effectiveness of performance solutions and tuning strategies. Accordingly, multiple metrics are monitored from each group, with the objective of uncovering any impediment to the efficient and effective utilization of the data center\u2019s physical infrastructure resources. Monitoring tools look for the specific workloads that create such impediments.<\/p>\n Furthermore, good monitoring tools also measure \u201cload average.\u201d Load average indicates whether a physical server is idle, in full use, highly loaded, or unusable due to overwhelming workloads. On Linux systems, this is determined by examining run-queue utilization averaged over time. 
The run-queue lists processes waiting for resources to become available. The best monitoring tools identify which processes are in the run-queue and what they are waiting for. It should be noted as well that idling servers can point to data center performance problems just as surely as highly loaded or unusable ones: idling can be symptomatic of network saturation, poor load balancing, and thread locks or deadlocks.
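As a rough sketch of the load-average idea above: on Linux, the 1-, 5-, and 15-minute averages can be read with the standard library, and comparing them to the core count suggests how busy the box is. The `classify_load` helper and its thresholds are illustrative assumptions, not a standard.

```python
import os

def classify_load(load_avg, cores):
    # Ratio of runnable processes to available cores.
    # Thresholds below are illustrative, not authoritative.
    ratio = load_avg / cores
    if ratio < 0.1:
        return 'idle'
    if ratio <= 1.0:
        return 'loaded'         # cores busy, but nothing queuing
    if ratio <= 2.0:
        return 'highly loaded'  # work is queuing behind the CPUs
    return 'overwhelmed'

# os.getloadavg() returns the 1-, 5- and 15-minute averages (Unix only).
one_min, five_min, fifteen_min = os.getloadavg()
print(classify_load(one_min, os.cpu_count()))
```

Because load average counts processes in the run-queue (and, on Linux, those in uninterruptible sleep), a "high" value only means trouble relative to the number of cores: a load of 8 is saturation on a 2-core host and comfortable on a 32-core one.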
\nIn addition to knowing which metrics to monitor, cloud administrators often run before\/after and A\/B tests on test resources, comparing their metrics before and after optimization against those of the production infrastructure. These tests measure the effectiveness of tuning strategies and performance solutions, and in public clouds such test resources are simple and cost-effective to provision.
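A before\/after comparison of this kind often reduces to comparing a summary statistic across two sample sets. The sketch below, with hypothetical latency samples and a made-up `improvement` helper, shows the shape of such a check.

```python
from statistics import mean

def improvement(baseline_ms, tuned_ms):
    # Percent reduction in mean latency after tuning (positive = faster).
    before, after = mean(baseline_ms), mean(tuned_ms)
    return 100.0 * (before - after) / before

# Hypothetical request latencies (ms) from production vs. a tuned test instance.
baseline = [120, 130, 125, 135]
tuned = [80, 85, 90, 75]
print(round(improvement(baseline, tuned), 1))  # → 35.3 (percent faster)
```

Real A\/B comparisons would use far larger samples and a significance test rather than a bare mean, but the before\/after framing is the same.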
\nThe tests and metrics used to monitor the productivity of IT infrastructure generally fall into three categories: quantity measures, quality measures, and responsiveness measures. These groups are applied to every layer of the IT infrastructure stack, from operating systems, CPUs, storage tiers, and networks to the efficiency and effectiveness of application code, computing services, and databases.<\/p>\n\n
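To make the three categories concrete, here is one possible grouping of familiar metrics. The specific metric names are illustrative assumptions; each organization will slot its own metrics into these buckets.

```python
# Illustrative examples only; the metrics an organization tracks will vary.
METRIC_GROUPS = {
    'quantity': ['throughput (requests per second)', 'IOPS', 'network bandwidth'],
    'quality': ['error rate', 'packet loss', 'failed transactions'],
    'responsiveness': ['p99 latency', 'time to first byte', 'query time'],
}

def group_of(metric):
    # Reverse lookup: which of the three categories does a metric belong to?
    for group, metrics in METRIC_GROUPS.items():
        if metric in metrics:
            return group
    return None

print(group_of('p99 latency'))  # → responsiveness
```

The value of the grouping is diagnostic: a drop in a quantity metric with stable responsiveness suggests a different bottleneck than rising p99 latency at constant throughput.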
\nTherefore, it is very helpful to use application and system monitoring tools to stay ahead of potential issues. These tools raise alerts for application and hardware problems, often before end users notice them. Lists of various monitoring tools can be found here<\/a> and here<\/a>.
\nSo, what are these tools measuring and monitoring?
\nAs you know, computer systems have several types of physical resources \u2013 CPU, volatile memory, network, and persistent storage<\/em> \u2013 which collectively affect data center performance. Those resources also shape application performance, and it is the level of application performance that determines how the data center is judged against its strategic goals and objectives. \u2026a data center with low operating costs and efficient power usage is still considered a failure if it cannot protect its data or meet its applications\u2019 quantity, quality and responsiveness targets<\/em>\u2026
\nConsequently, monitoring tools continually measure the data center\u2019s quantity, quality, and responsiveness across each of these physical resources.<\/p>\n\n
\n
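A minimal, stdlib-only snapshot of some of these resources on a Linux\/Unix host might look like the sketch below. The `snapshot` and `mem_available_kb` helpers are assumptions for illustration; a real monitoring agent would sample continuously and cover the network as well (e.g., via \/proc\/net or an SNMP\/agent API).

```python
import os
import shutil

def snapshot():
    # One-shot readings of CPU and persistent storage (Linux/Unix only).
    disk = shutil.disk_usage('/')
    return {
        'cpu_cores': os.cpu_count(),
        'load_1min': os.getloadavg()[0],                  # CPU pressure
        'disk_used_pct': 100.0 * disk.used / disk.total,  # storage capacity
    }

def mem_available_kb():
    # Volatile memory: parse MemAvailable from /proc/meminfo (Linux-specific).
    with open('/proc/meminfo') as f:
        for line in f:
            if line.startswith('MemAvailable:'):
                return int(line.split()[1])
    return None

print(snapshot(), mem_available_kb())
```

Point-in-time readings like these only become useful once they are collected on an interval and trended, which is exactly what the monitoring tools discussed here automate.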
\nMonitoring tools can only go so far, however. Application troubleshooting and profiling tools need to be used to help identify the causes of performance problems. For example, a profiling tool like JProfiler can check for Java methods that consume excessive CPU, and it can determine how much time a Java application is spending on garbage collection. Some tools also provide the details of transactions within an application server, pinpointing, for instance, SQL queries that take too long to execute, or identifying which methods in a Java class are slowing down the application.
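JProfiler is Java-specific, but the same profiling idea exists in most runtimes. As a stdlib analogue, Python's `cProfile` and `pstats` can rank functions by time consumed; `busy_method` below is a stand-in for an expensive application method.

```python
import cProfile
import io
import pstats

def busy_method(n):
    # Stand-in for an expensive application method worth profiling.
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
busy_method(200000)
profiler.disable()

# Render the top five entries, sorted by cumulative time.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats('cumulative').print_stats(5)
report = buf.getvalue()
print(report)  # shows which functions consumed the most time
```

As with JProfiler, the payoff is attribution: the report names the exact function (here `busy_method`) responsible for the CPU time, which is the starting point for any fix.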
\nOnce problem processes and applications are identified, it\u2019s time to dig into these workloads to determine exactly how they are negatively impacting performance so fixes can be made. A list of common problems and potential solutions can be found here<\/a>. Additionally, some good, quick tuning strategies can be found here<\/a> if short-term, temporary repairs will suffice while long-term solutions are developed and deployed.
\nAs noted, numerous factors impact data center performance, so IT organizations are constantly looking for ways to proactively identify and respond to problems. Knowing what to monitor, and understanding how to improve the data center\u2019s performance, is therefore critical \u2013 because in today\u2019s world, IT\u2019s effectiveness is measured by how much it empowers the strategic and tactical success of the businesses it supports.<\/p>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":114,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[76,54,77,78,79,80],"acf":[],"yoast_head":"\n