The Database Detective – The Performance Questionnaire
Solving performance issues often requires more than just hopping right into the database diagnostic tools to review performance charts and graphs. Some performance problems can certainly be solved by immediately looking at a particular set of performance metrics, but many issues require a further definition of the problem before investigation begins.
The intent of this article is not to have the database detective ask each and every question verbatim, but to take a structured approach to the information collection process. The purpose of these questions is to define the problem and narrow the scope of the issue. Narrowing the scope is the key to quick performance problem resolution. The DBA can then focus on a specific issue rather than taking a shotgun approach to performance problem analysis.
It assumes the DBA knows very little about the application. It is obvious that DBAs who have supported any application for a length of time will inherently know the answer to many of the upcoming questions.
- What is the nature of the application? What does it do?
- Is the application a third-party product or internally built?
- If the application is from a third-party, who is the vendor?
- How many users does the application support?
- Does the application permit users to access the data using AD-HOC query tools?
- Does the application have any batch processing work executed?
- Are users allowed to schedule AD-HOC batch work in the system?
Identifying the Problem
The investigation continues by attempting to narrow the scope of the problem. It is obvious that the answers to one question may answer several others. This will allow the DBA to limit their upcoming technical investigation to the specific problem.
- What is the nature of the problem? (i.e. entire application slow, group of transactions, specific transaction, individual screen)
- Are all users experiencing the problem or is it localized to a single user, group of users or a geographical business site?
- What is the user’s evaluation of the performance issue? Is it 2 times as slow, 3 times as slow?
- What impact is the performance problem having on the users?
- Which performance objectives are not being met?
- Can the problem be recreated on demand?
Establishing a Timeline
Identifying when the performance issue began to occur is critical to the investigation process. The DBA will be able to use this information to determine if any business and/or technical changes were made to the environment. These changes are often the root cause of the performance problem (see section on identifying changes below).
It is also important to determine if the problem can be associated with a particular time of day, day of week or month. Answers to these questions will allow the DBA to investigate possible workload contention issues (see section on workloads below).
- When did the problem start occurring?
- Has the problem existed from the time the system went into production or has it occurred after implementation?
- Does the problem occur intermittently or is performance poor all of the time?
- Does the problem occur during specific times of the month? (i.e. month-end, beginning of the month, mid-month)
- Does the problem occur during specific calendar days (i.e. Monday, Friday)?
- Is the issue more prevalent at particular times during the day?
- Is the problem getting worse? If so, is it daily, weekly?
Identifying Changes
identifying the exact date when performance began to suffer allows the DBA to determine if anything else changed at the same time. Many of the database performance monitoring tools allow users to review performance metrics historically, but answer to the questions below will certainly speed the investigation process.
- Did anything change from a business perspective (adding more users, increasing data volumes)?
- Did anything change from a technical perspective (software upgrades/patches, new application code changes or new code implemented, database moved to another server, adding databases to server, hardware changes)?
Are Workloads Causing the Issue?
Answers from the previous questions, that attempt to identify the particular time of day, day of week or month that the performance problem occurs, allow the DBA to draw correlations between workloads and performance. Workload increases and workload contention are common causes of performance issues.
- Are there databases used by other applications on the server in question?
- Are there any application batch jobs running when the problems occur?
- Are there any system jobs running when the problems occur (i.e. maintenance jobs, etc)?
- Is there an increase in the number of online users during the times when performance is an issue? Are more people using the system at that time?
Application Technology
As stated previously, if the DBA has supported the application for any length of time, this information should be obvious. If the entire application is slow, information on the architecture and individual server environment becomes very important to the investigation. Proper sizing of the various components is a requirement for good application performance.
- Identify the application server configurations (number of app servers, web servers, load balancers and hardware configuration of each including CPUs, memory disk subsystems)
- Identify the database ecosystem (hardware, operating system, database):
- Hardware vendor, model and configuration (CPUs, memory)
- Disk subsystem (local, SAN, NAS, storage array vendor)
- Operating system version and patches
- Database version and patches
- Highly available configuration
Conclusion
In my next article, we’ll use the information provided by these questions to focus our technical investigation. We’ll run through some common scenarios and review some of the methods used to further narrow the scope of the analysis process.
Thanks for Reading,
Chris Foot
Oracle Ace
Director Of Customer Value & Service Delivery
