What Is Performance?

12.2 What Is Performance?

Performance metrics are typically expressed as response time (R), throughput (X), or utilization (U) of a system, depending on the load to be processed. The response time is generally defined as the time between sending a request to the system and receiving an answer from that system. The throughput states how many requests can be served per time unit, and the utilization states the percentage of time during which the system is busy.
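As a rough illustration of these three definitions, the following sketch computes R, X, and U from a small request log. The `Request` record, its field names, and the sample values are hypothetical and chosen only for this example; the utilization formula assumes a single server that handles one request at a time.

```python
from dataclasses import dataclass


@dataclass
class Request:
    sent: float      # time the request was sent (seconds)
    answered: float  # time the answer was received (seconds)
    service: float   # time the system was busy serving it (seconds)


def metrics(requests, interval):
    """Return (mean response time R, throughput X, utilization U).

    `interval` is the length of the observation period in seconds.
    Assumes a single server, so busy times can simply be summed.
    """
    n = len(requests)
    r = sum(q.answered - q.sent for q in requests) / n
    x = n / interval
    u = sum(q.service for q in requests) / interval
    return r, x, u


# Made-up log covering a 4-second observation period.
log = [
    Request(sent=0.0, answered=0.5, service=0.25),
    Request(sent=1.0, answered=1.5, service=0.25),
    Request(sent=2.0, answered=3.0, service=0.5),
    Request(sent=3.0, answered=4.0, service=0.5),
]

R, X, U = metrics(log, interval=4.0)
print(f"R = {R:.2f} s, X = {X:.2f} requests/s, U = {U:.1%}")
```

Note that the response time includes any waiting time, which is why it can be larger than the service time of the same request.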

Taking an initial (simplified) look, we can describe the workload by the number of requests that arrive at the system per time unit, and assume a linear relationship between workload and throughput, defined as the number of requests served per time unit. The actual throughput does increase linearly while the workload is low (see Figure 12-1), but it then asymptotically approaches an upper limit. This upper limit is determined by the system bottleneck (see section 12.6 for a calculation of this threshold value), and the throughput can even drop again when the system becomes overloaded.

[Figure 12-1: two plots against load: throughput (with the theoretical maximum capacity and the nominal system capacity marked) and response time (with the SLA threshold marked).]

Figure 12-1 Typical curve of throughput and response time as functions of the workload.

If we want to maximize the throughput to serve as many requests per time unit as possible, then a strategy that takes the system as close as possible to the theoretical maximum capacity would be optimal. However, the response time increases as throughput rises, so maximizing throughput is generally not a good strategy for an interactive information system. This is why response time is normally also used as a performance metric for Web applications. A typical formulation of performance, such as the one used in some Web benchmarks (e.g., SPECweb; see section 12.6.3), states the maximum throughput, defined as the number of HTTP requests served within a given time interval, during which a given threshold value for the response time is not exceeded. This threshold value is known as the service level agreement (SLA), and the corresponding throughput value is referred to as the nominal system capacity.
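The definition of nominal system capacity can be sketched in a few lines: given throughput and response time measured at several load levels, pick the highest throughput whose response time stays within the SLA threshold. The function name `nominal_capacity` and all measurement values below are invented for illustration; they do not come from any real benchmark run.

```python
def nominal_capacity(measurements, sla_seconds):
    """Highest measured throughput whose response time stays within the SLA.

    measurements: iterable of (throughput_req_per_s, response_time_s) pairs,
    one pair per load level. Returns None if no load level meets the SLA.
    """
    within_sla = [x for x, r in measurements if r <= sla_seconds]
    return max(within_sla, default=None)


# Made-up measurements, ordered by increasing load.
runs = [
    (50.0, 0.10),
    (100.0, 0.12),
    (150.0, 0.20),
    (190.0, 0.45),
    (200.0, 1.50),  # close to the theoretical maximum capacity
    (180.0, 4.00),  # overloaded: throughput drops again
]

capacity = nominal_capacity(runs, sla_seconds=0.5)
print(f"Nominal system capacity: {capacity} requests/s")
```

With these sample values the highest measured throughput is 200 requests per second, but the nominal capacity under a 0.5-second threshold is lower, because the response time at 200 requests per second already exceeds the threshold.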

Performance analysis is aimed at determining the relationship between performance indicators (response time, throughput, and utilization), the workload, and the performance-relevant characteristics of the system under study. It is also aimed at discovering performance problems or bottlenecks, and at identifying measures to improve performance. We can use measuring or modeling methods to analyze performance. Measuring techniques, of course, require access to an existing system (see Figure 12-2).

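[Figure 12-2: classification of methods. A system that exists and is accessible can be evaluated by measurement (monitoring, benchmarks); a system that exists at least as an abstraction can be evaluated by modeling (analytical, simulation). Also listed: intuition, trend, interpolation, experience.]

Figure 12-2 Overview of methods for performance evaluation.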

Modeling methods don’t necessarily require an existing system for analysis; they abstract the performance-relevant characteristics from a system description and map them to a model. Depending on that model’s complexity, we can then use analytical methods or simulation to determine the performance metrics.
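To give a feel for what an analytical model looks like, the sketch below uses the textbook M/M/1 queue, which is an assumption made here for illustration and not a model prescribed by this section. It assumes a single server, Poisson arrivals, and exponentially distributed service times, for which the mean response time is R = S / (1 - U) with U = arrival rate × S. The service time of 10 ms is hypothetical.

```python
def mm1_response_time(arrival_rate, service_time):
    """Mean response time of an M/M/1 queue: R = S / (1 - U), U = lambda * S."""
    utilization = arrival_rate * service_time
    if utilization >= 1.0:
        raise ValueError("unstable: arrival rate meets or exceeds capacity 1/S")
    return service_time / (1.0 - utilization)


S = 0.01  # hypothetical mean service time of 10 ms, i.e. capacity 100 requests/s

for lam in (10, 50, 90, 99):
    r_ms = mm1_response_time(lam, S) * 1000
    print(f"load {lam:3d} requests/s -> U = {lam * S:.2f}, R = {r_ms:.1f} ms")
```

Even this simple model reproduces the qualitative behavior described earlier: the response time stays close to the service time at low load and grows sharply as the load approaches the capacity of the server.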

The following discussion looks at the special aspects in analyzing the performance of Web applications.