8.2 MODEL VERIFICATION VIA PERFORMANCE ANALYSIS
At some point in the modeling enterprise, the modeler should conduct more sophisticated consistency analysis beyond sanity checks. The vast majority of Arena-based simulation models include a queueing component, and such models call for a performance analysis study of a queueing system, which involves the computation of appropriate performance measures and verification of certain relations among them. Due to the pervasiveness of queueing models in simulation modeling, the analyst needs to acquire a modicum of knowledge of queueing theory. This section overviews some elements of performance analysis of queueing systems and associated queueing theory that support model verification.
8.2.1 GENERIC WORKSTATION AS A QUEUEING SYSTEM
The queueing-modeling paradigm plays a key role in the analysis and design of manufacturing systems, telecommunications systems, transportation systems, and other areas. Queueing theory is a well-developed field that has produced an enormous body of research on queueing systems and queueing networks. The reader is referred to Gross and Harris (1974), Cooper (1990), and Kleinrock (1975), to name a few. Here, we review briefly the formulation and analysis of queueing systems commonly used to model manufacturing and service systems.

Figure 8.1 A simple workstation. (Job arrivals enter the WIP inventory buffer, are served by the processor, and leave as job departures.)
Consider a generic queueing example from the domain of manufacturing. Recall that a workstation on the factory floor is a location where work is carried out by servers (machines or humans), possibly subject to random failures. Figure 8.1 depicts a schematic workstation, where jobs arrive to be processed or machined. Excess jobs are kept in a buffer (work-in-process, abbreviated as WIP) until their processing can begin. Upon completion, jobs depart for another destination (possibly another workstation).
When studying workstations in manufacturing environments, we routinely model them as queueing systems. The flow of incoming jobs, either singly or in batches, forms the arrival stream. The time that a job is delayed for processing at the workstation server is its service time. If, however, an incoming job cannot start processing right away (because the server or servers are busy), then it is held in a buffer, and in due time will be removed from the buffer and assigned a server. In a workstation model, the buffer capacity may be finite or infinite, although in real-life workstations it is, of course, finite. A job that finds the buffer full on arrival is usually rerouted to another workstation or simply assumed to be lost. The order in which jobs queue up in the buffer is called a queueing discipline. A workstation model may have additional wrinkles, such as server failure and repairs, routinely modeled via random uptimes and downtimes.
8.2.2 QUEUEING PROCESSES AND PARAMETERS
A waiting line necessarily develops in every service-providing facility, whenever it cannot momentarily cope with its current workload (brought in by customers, jobs, demands, or more generally, transactions, all of which will be used interchangeably).
A queueing system that models a service facility is mathematically characterized by its customer arrival process, service process, number of servers, service discipline, and queue (buffer) capacity. The arrival process is routinely defined in terms of the probability law governing the time intervals between consecutive arrivals (called interarrival times). Accordingly, let $A_i$ be the interarrival time between job $i-1$ and job $i$. It is often assumed that the $A_i$ are iid (independent, identically distributed) random variables. The arrival rate, denoted by $\lambda$, is the expected number of job arrivals per unit time, and is given by $\lambda = 1/E[A_i]$ for all $i$; that is, the arrival rate is the reciprocal of the expected interarrival time. The service time is the time that the server devotes to a particular job. Let $X_i$ be the service time of job $i$. Again, it is often assumed that the $X_i$ are iid random variables with service rate $\mu = 1/E[X_i]$ for all $i$; that is, the service rate is the reciprocal of the expected service time.
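As a minimal sketch of these definitions, the rates can be estimated from observed durations by taking the reciprocal of the sample mean. The helper name `rate_from_times` and the sample values are hypothetical and serve only as an illustration.

```python
from statistics import mean


def rate_from_times(times):
    """Return the reciprocal of the sample mean of a list of positive durations."""
    if not times:
        raise ValueError("need at least one observation")
    average = mean(times)
    if average <= 0:
        raise ValueError("mean duration must be positive")
    return 1.0 / average


# Hypothetical observations (time units are arbitrary).
interarrival_times = [2.0, 4.0, 3.0, 3.0]  # samples of A_i, sample mean 3.0
service_times = [1.0, 2.0, 3.0, 2.0]       # samples of X_i, sample mean 2.0

arrival_rate = rate_from_times(interarrival_times)  # estimate of lambda
service_rate = rate_from_times(service_times)       # estimate of mu
```

With these samples, the estimated arrival rate is one job per three time units and the estimated service rate is one job per two time units.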
8.2.3 SERVICE DISCIPLINES
In a multiserver facility, the servers are also often assumed to be statistically independent and identical, with a common service time distribution (iid servers). The customer waiting room (buffer) usually has finite capacity; however, a very large buffer may be considered infinite for modeling purposes. When more than one server is available for service, a server is usually selected at random with equal probabilities. Recall that the order in which jobs are moved from the buffer to seize a server and start service is the service discipline or queueing discipline, the most common of which are listed below:
FIFO (first in, first out), also known as FCFS (first come, first served), is the most common discipline. Here, jobs are served in their order of arrival.
LIFO (last in, first out), also known as LCFS (last come, first served), serves the most recent arrival first.
SIRO (service in random order) selects the next job in the buffer randomly, each with equal probability.
RR (round robin) is associated with a (fixed) service time, often referred to as the time quantum. Jobs are served cyclically, one quantum at a time, until attaining their requisite service time.
PS (priority service) assumes that jobs have priorities associated with them, and selects a job with the highest service priority. If several are present, then a secondary discipline may be used to select among highest-priority jobs, such as FIFO, LIFO, SIRO, and so on. Jobs with the same priority are said to belong to the corresponding priority class.
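The selection rules above can be sketched as a single function that removes the next job from a buffer. This is an illustration under stated assumptions, not a prescribed implementation: the function name `next_job`, the representation of a job as a `(priority, job_id)` tuple, the convention that a larger number means higher priority, and the choice of FIFO as the secondary discipline for PS are all hypothetical. RR is omitted because it requires preemption and bookkeeping of remaining service times.

```python
import random
from collections import deque


def next_job(buffer, discipline, rng=random):
    """Remove and return the next job from `buffer`, a deque of (priority, job_id)
    tuples held in order of arrival (oldest job on the left)."""
    if not buffer:
        raise IndexError("buffer is empty")
    if discipline == "FIFO":
        return buffer.popleft()          # oldest arrival first
    if discipline == "LIFO":
        return buffer.pop()              # most recent arrival first
    if discipline == "SIRO":
        index = rng.randrange(len(buffer))  # each waiting job equally likely
        job = buffer[index]
        del buffer[index]
        return job
    if discipline == "PS":
        # Highest priority wins; max() returns the first maximal element,
        # so ties are broken in FIFO order (assumed secondary discipline).
        best = max(buffer, key=lambda job: job[0])
        buffer.remove(best)
        return best
    raise ValueError(f"unsupported discipline: {discipline}")


waiting = deque([(1, "a"), (3, "b"), (3, "c"), (2, "d")])
served = next_job(waiting, "PS")  # picks (3, "b"), the earlier of the two priority-3 jobs
```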
A standard notation has been developed to specify a queueing system succinctly. It employs slash-separated symbols, representing an arrival process, a service process, the number of servers, and the queue capacity (understood to be infinity, when omitted), in this order. The symbol M (shorthand for Markov, see Section 3.9.4) stands for an exponential distribution, D for a deterministic value, and GI for a general distribution, all in the corresponding iid arrival or service process. Thus, M/M/1 specifies a queue with iid exponential distributions for interarrival and service times, a single server, and infinite buffer capacity; similarly, M/D/k/C specifies a queue with iid exponential interarrival times, deterministic service times, k servers, and a capacity C, of which the first k positions are occupied by servers.
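To make the notation concrete, the following sketch parses a specification string into its components. The names `QueueSpec` and `parse_queue_notation` are hypothetical, only the symbols M, D, and GI mentioned above are recognized, and the number of servers and the capacity must be given as integers (symbolic values such as k or C are not handled).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueSpec:
    arrival: str      # "M", "D", or "GI"
    service: str      # "M", "D", or "GI"
    servers: int
    capacity: float   # math.inf when the capacity field is omitted


def parse_queue_notation(spec):
    """Parse strings such as 'M/M/1' or 'M/D/2/10' into a QueueSpec."""
    parts = spec.split("/")
    if len(parts) not in (3, 4):
        raise ValueError("expected 3 or 4 slash-separated fields")
    arrival, service = parts[0], parts[1]
    for symbol in (arrival, service):
        if symbol not in {"M", "D", "GI"}:
            raise ValueError(f"unknown process symbol: {symbol}")
    servers = int(parts[2])
    capacity = int(parts[3]) if len(parts) == 4 else math.inf
    # The capacity counts the service positions, so it cannot be below the server count.
    if servers < 1 or capacity < servers:
        raise ValueError("need servers >= 1 and capacity >= servers")
    return QueueSpec(arrival, service, servers, capacity)


single_server = parse_queue_notation("M/M/1")      # infinite capacity
finite_queue = parse_queue_notation("M/D/2/10")    # 2 servers, capacity 10
```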
8.2.4 QUEUEING PERFORMANCE MEASURES
A queueing system is studied to glean understanding of its behavior and to estimate its performance measures (metrics) of interest. Common performance measures follow:
Average number of jobs in the queue (buffer only)
Average number of jobs in the system (buffer and servers)
Average job waiting time (buffer-only delay)
Average job sojourn time (buffer and service delays)
Server utilization (fraction of time a server is busy)
Throughput (output rate, namely, departure rate from the system)
One general way of evaluating these measures is to first compute analytically (if possible) or to estimate (by simulation) the steady-state probability distribution of the number of jobs in the system. Thus, if the system has capacity $K$ and $P_n$ denotes the steady-state probability of having $n$ jobs in the system, then the average number of jobs in the system, $N_S$, is given by

$$N_S = \sum_{n=0}^{K} n P_n, \tag{8.1}$$

where $K$ may be finite or infinite. In a single-server system, the average number of jobs in service is $1 - P_0$ (the probability that the system is not empty of customers), so the average number of jobs in the queue, $N_q$, becomes

$$N_q = N_S - (1 - P_0). \tag{8.2}$$

In general, $1 - P_0 = \rho$ is the server utilization for any single-server queueing system with finite or infinite queue capacity. For infinite capacity systems, utilization can be expressed as the ratio of the input rate and the service rate, that is, $\rho = \lambda/\mu$. When $\rho \ge 1$, the server cannot keep up with the incoming workload, and consequently the number of jobs in the system grows without bound in the long run. This situation is colorfully referred to as "system explosion," but in queueing theory terminology, the system is said to be unstable. For stability to hold and long-run measures to exist, we must have $\rho < 1$, or equivalently, $P_0 > 0$. Clearly, a finite-capacity system is always stable, because the number of jobs in it is bounded by its capacity. Note that we are typically interested in stable systems in the context of simulation modeling.
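The computation of the mean number in the system and in the queue from a steady-state distribution can be sketched as follows. The distribution used here is an assumption brought in for illustration: for an M/M/1/K queue, the standard result is that the probability of n jobs in the system is proportional to the n-th power of the ratio of arrival rate to service rate, a fact not derived in this section. All function names are hypothetical.

```python
def mm1k_distribution(lam, mu, K):
    """Steady-state probabilities P_0..P_K of an M/M/1/K queue (assumed model).
    P_n is proportional to (lam/mu)**n; normalizing also covers lam == mu."""
    a = lam / mu
    weights = [a ** n for n in range(K + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def mean_in_system(P):
    """Average number of jobs in the system: sum over n of n * P_n."""
    return sum(n * p for n, p in enumerate(P))


def mean_in_queue(P):
    """Average number in the queue of a single-server system:
    mean in system minus the mean number in service, 1 - P_0."""
    return mean_in_system(P) - (1.0 - P[0])


P = mm1k_distribution(lam=0.8, mu=1.0, K=5)
utilization = 1.0 - P[0]          # fraction of time the server is busy
n_system = mean_in_system(P)
n_queue = mean_in_queue(P)
```

A useful consistency check is that the mean in the queue computed this way agrees with the direct sum of (n - 1) times P_n over n from 1 to K.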
8.2.5 REGENERATIVE QUEUEING SYSTEMS AND BUSY CYCLES
The fundamental relation $\rho = \lambda/\mu$ can be justified more rigorously using the following so-called regeneration argument for the GI/GI/1 queue, that is, for iid interarrival times and iid service times. Any such stable queueing system goes through an endless sequence of cycles, called busy cycles. Figure 8.2 illustrates busy cycles in a queue. Each busy cycle consists of an idle period (during which the system is empty and the server is idle), followed by a busy period (during which the system is not empty and the server is busy serving jobs). When the system starts an idle period, the next event will definitely be a job arrival, and the (random) time until this event occurs is independent of the history of the workstation up until that point. Thus, in the GI/GI/1 queue, the system history regenerates itself (statistically) at time points inaugurating idle periods, in the sense that system histories over distinct regeneration cycles (intervals consisting of successive idle and busy periods) are iid. The corresponding stochastic processes are therefore called regenerative processes or renewal processes (see Section 3.9.3). For example, the stochastic process of the number of jobs in the system is regenerative (refer again to Figure 8.2).

Figure 8.2 Regeneration cycles consisting of successive idle and busy periods. (Vertical axis: number in the system; horizontal axis: time, with alternating empty-and-idle and busy periods marked.)
Let $T_B$ be the length of the busy period, and let $T_C$ be the length of the regeneration cycle. In a system with infinite capacity, all jobs arriving during a regeneration cycle are served during the associated busy period. Thus, the number of incoming and outgoing jobs over a regeneration cycle (as well as the corresponding averages) must coincide, that is,

$$\lambda E[T_C] = \mu E[T_B],$$

resulting in the relation

$$\frac{\lambda}{\mu} = \frac{E[T_B]}{E[T_C]}. \tag{8.3}$$

The right side in Eq. 8.3 is precisely the fraction of time the server is busy over a regeneration cycle, that is, $\rho = \lambda/\mu = \Pr\{\text{server is busy}\}$.
The lengths of the busy period and the regeneration cycle are also important measures of interest. When interarrival times are Expo($\lambda$), the idle time in the regeneration cycle is also Expo($\lambda$), because it is the residual interarrival time (from the time instant the server goes idle). Given the memoryless property of the exponential distribution (see Section 3.8.4), the following expression can be written:

$$\Pr\{\text{server is idle}\} = 1 - \rho = \frac{1/\lambda}{E[T_C]}, \tag{8.4}$$

which readily yields

$$E[T_C] = \frac{1}{\lambda (1 - \rho)}. \tag{8.5}$$

From Eqs. 8.3 and 8.5, we obtain

$$E[T_B] = \rho E[T_C] = \frac{E[X]}{1 - \rho}, \tag{8.6}$$

where $X$ is the service time with $E[X] = 1/\mu$. The quantity $E[T_B]$ merits further elaboration. Each job inaugurating a busy period arrives at an empty system. However, new jobs (the "offspring" of the first job) may arrive during its service time, and each of those may have its own "offspring." Thus, $E[T_B]$ can be viewed as the expected time to serve the "family" of jobs "spawned" by the first job in the busy cycle. A similar analysis of servers subject to various failures can be found in Altiok (1997).
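The busy-cycle relations above lend themselves to a quick simulation check. The sketch below assumes exponential interarrival and service times (an M/M/1 queue) and tracks the unfinished workload within each cycle; for convenience, each simulated cycle starts with its busy period and ends with its idle period, which does not affect the long-run averages. The function name, the seed, and the parameter values are hypothetical choices.

```python
import random


def simulate_busy_cycles(lam, mu, n_cycles, seed=12345):
    """Return (mean busy period, mean idle period) over n_cycles regeneration cycles
    of a single-server queue with Expo(lam) interarrival and Expo(mu) service times."""
    rng = random.Random(seed)
    busy_total = 0.0
    idle_total = 0.0
    for _ in range(n_cycles):
        workload = rng.expovariate(mu)   # service time of the job that opens the busy period
        busy = workload                  # busy period = sum of service times in the cycle
        while True:
            gap = rng.expovariate(lam)   # time until the next arrival
            if gap >= workload:
                # The server empties before the next arrival: the busy period ends.
                idle_total += gap - workload
                break
            # The arrival joins the busy period and adds its service time to the workload.
            service = rng.expovariate(mu)
            workload = workload - gap + service
            busy += service
        busy_total += busy
    return busy_total / n_cycles, idle_total / n_cycles


lam, mu = 0.5, 1.0
mean_busy, mean_idle = simulate_busy_cycles(lam, mu, n_cycles=100_000)
rho = lam / mu
busy_fraction = mean_busy / (mean_busy + mean_idle)   # compare with rho (Eq. 8.3)
predicted_cycle = 1.0 / (lam * (1.0 - rho))           # Eq. 8.5
predicted_busy = (1.0 / mu) / (1.0 - rho)             # Eq. 8.6
```

For these parameters, the predicted mean busy period is 2 and the predicted mean cycle length is 4; the simulated averages should be close to these values, up to sampling error.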
8.2.6 THROUGHPUT
The throughput (output or departure rate) of a queueing system can be expressed in terms of utilization. The server works at the rate of $\mu$ jobs per unit time (while busy), yielding an average throughput of

$$\mu \Pr\{\text{server is busy}\} = \mu (1 - P_0) = \mu \rho. \tag{8.7}$$

For queues with infinite capacity, where $\rho = \lambda/\mu$, Eq. 8.7 reduces to $\lambda$ due to long-run job flow conservation in stable queueing systems (see also Eq. 8.3).
8.2.7 LITTLE'S FORMULA
Consider any steady-state storage system, where units arrive, spend some time in the system, and eventually depart. A general steady-state relation of great practical importance is Little's formula,

$$N = \lambda W, \tag{8.8}$$

where $N$ is the average number of units in the system, $\lambda$ is the arrival rate, and $W$ is the average time a job spends in the system (mean flow time). In particular, Little's formula holds for any queueing subsystem (one should, however, draw the boundaries of the subsystem under consideration carefully); for example, $N_S = \lambda W_S$ for the whole system and $N_b = \lambda W_b$ for the buffer alone. Interestingly, Little's formula is valid for any queue capacity. Thus, for a queue with finite capacity (and therefore with a possible loss stream), the arrival rate, $\lambda$, in Eq. 8.8 is simply the effective arrival rate (the expected rate of arrivals that manage to enter the system, excluding any lost jobs), and not the offered arrival rate (which includes lost jobs).
In effect, Little's formula expresses a conservation law, although this is not immediately evident from Eq. 8.8. While a formal proof is fairly intricate, a heuristic plausibility argument (more accurately, an interpretation of Eq. 8.8) can be advanced as follows. Consider a target customer traversing the system. Then, the right-hand side of Eq. 8.8 is the expected number of customers that arrive while the target customer sojourns in the system, while the left-hand side of Eq. 8.8 is the expected number of customers that accumulate in the system. Equation 8.8 asserts that these expectations are equal.
Little’s formula is an important tool for verifying queueing simulations due to its simplicity and generality. It may be used to discover modeling errors or coding bugs,
where customers are erroneously “marooned” in the system, or are inadvertently created or destroyed.
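A verification check based on Little's formula can be sketched as follows. The time-average number in the system is computed from the event trace, while the arrival rate and the mean sojourn time are computed from the per-job records; the two sides are then compared. The traffic generator assumes a single-server FIFO queue with exponential interarrival and service times purely to produce test data, and all names and parameter values are hypothetical.

```python
import random


def simulate_fifo_queue(lam, mu, n_jobs, seed=2024):
    """Return arrival and departure times of n_jobs in a single-server FIFO queue."""
    rng = random.Random(seed)
    arrivals, departures = [], []
    clock = 0.0
    last_departure = 0.0
    for _ in range(n_jobs):
        clock += rng.expovariate(lam)            # next arrival time
        start = max(clock, last_departure)       # wait if the server is still busy
        last_departure = start + rng.expovariate(mu)
        arrivals.append(clock)
        departures.append(last_departure)
    return arrivals, departures


def little_check(arrivals, departures):
    """Return (time-average number in system, arrival rate, mean sojourn time)
    over the interval from time 0 to the last departure."""
    horizon = departures[-1]  # FIFO with one server: the last job departs last
    events = sorted([(t, +1) for t in arrivals] + [(t, -1) for t in departures])
    area, in_system, previous = 0.0, 0, 0.0
    for time, change in events:
        area += in_system * (time - previous)    # integrate the number in system
        in_system += change
        previous = time
    n_avg = area / horizon
    lam_hat = len(arrivals) / horizon
    w_avg = sum(d - a for a, d in zip(arrivals, departures)) / len(arrivals)
    return n_avg, lam_hat, w_avg


arrivals, departures = simulate_fifo_queue(lam=0.5, mu=1.0, n_jobs=50_000)
n_avg, lam_hat, w_avg = little_check(arrivals, departures)
discrepancy = abs(n_avg - lam_hat * w_avg)
```

Because every job that enters has departed by the end of the observation interval, the discrepancy should be at the level of floating-point rounding. A model that strands, duplicates, or destroys jobs would typically show up as a clearly nonzero discrepancy when the same two sides are computed from its output.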
8.2.8 STEADY-STATE FLOW CONSERVATION
A widely used conservation relation for single-server queueing systems with finite capacity (buffer size plus service positions), $K$, is

$$\lambda (1 - p_K) = \mu (1 - P_0), \tag{8.9}$$

where $p_K$ is the steady-state probability that an arriving job finds the system full on arrival, and $P_0$ is the steady-state probability that the system is empty (see Gross and Harris [1998]). Thus, Eq. 8.9 is a flow conservation equation stating that in steady state, the effective arrival rate into the system equals the departure rate from the system (throughput). Note carefully that probabilities of the form $p_k$ (steady-state probability of a job finding $k$ jobs already in the system on arrival) and $P_k$ (steady-state probability of $k$ jobs in the system) are distinct quantities with the following operational meaning:

$p_k$ is the long-run fraction of jobs that find (on arrival) $k$ jobs in the system. Such quantities are referred to as customer-average probabilities or arrival-point probabilities. Note carefully that the number of jobs in the system is computed only at the (random) times of arrival and excludes the arriving job. For this reason, this type of system state is referred to as the state embedded at arrival times, or as the state seen by arriving jobs.

$P_k$ is the long-run fraction of time that the system has precisely $k$ jobs in it. Such quantities are referred to as time-average probabilities or arbitrary-time probabilities.
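The flow conservation relation of Eq. 8.9 can be checked numerically for a specific model. The sketch below assumes an M/M/1/K queue, whose steady-state probabilities are proportional to powers of the ratio of arrival rate to service rate (a standard result not derived here), and it assumes Poisson arrivals so that the probability that an arrival finds the system full equals the time-average probability of a full system. The function name is hypothetical.

```python
def mm1k_flow_balance(lam, mu, K):
    """Return (effective arrival rate, departure rate) for an M/M/1/K queue (assumed model)."""
    a = lam / mu
    weights = [a ** n for n in range(K + 1)]
    total = sum(weights)
    P = [w / total for w in weights]              # time-average probabilities P_0..P_K
    # Assumption: with Poisson arrivals, the arrival-point probability p_K equals P_K.
    effective_arrival_rate = lam * (1.0 - P[K])   # left side of Eq. 8.9
    departure_rate = mu * (1.0 - P[0])            # right side of Eq. 8.9
    return effective_arrival_rate, departure_rate


rate_in, rate_out = mm1k_flow_balance(lam=2.0, mu=1.0, K=3)
```

The two rates agree even in this overloaded example (arrival rate above service rate), which illustrates that a finite-capacity system remains stable because excess jobs are lost.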
8.2.9 PASTA PROPERTY
Generally, customer averages and time averages are not equal (embedding the state at random times can introduce a "bias" into a customer average as compared to the corresponding time average). While generally $p_k \neq P_k$, there are cases when they are equal (Melamed and Whitt [1990a,b], Melamed and Yao [1995]). The most common case where the equality

$$p_k = P_k$$

holds is known as PASTA (Poisson Arrivals See Time Averages); it applies when the arrival stream is a Poisson process. Intuitively, this can be attributed to the memoryless property of the iid exponentially distributed interarrival times (see Section 3.8.4), or equivalently, to the independent increments of the Poisson process (see Section 3.9.2).
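The distinction between customer averages and time averages can be illustrated by simulation. The sketch below estimates, for a single-server FIFO queue with exponential service times, the fraction of arrivals that find the system empty and the fraction of time the system is empty, first with Poisson arrivals and then with evenly spaced (deterministic) arrivals at the same rate. The function name, seed, and parameter values are hypothetical.

```python
import random


def empty_fractions(next_interarrival, mu, n_jobs, rng):
    """Return (fraction of arrivals finding the system empty,
    fraction of time the system is empty up to the last arrival)."""
    clock = 0.0
    last_departure = 0.0
    idle_time = 0.0
    found_empty = 0
    for _ in range(n_jobs):
        clock += next_interarrival()
        if clock >= last_departure:
            # The previous job has already left: this arrival sees an empty system.
            found_empty += 1
            idle_time += clock - last_departure
        start = max(clock, last_departure)
        last_departure = start + rng.expovariate(mu)
    return found_empty / n_jobs, idle_time / clock


rng = random.Random(7)
lam, mu, n_jobs = 0.5, 1.0, 200_000

# Poisson arrivals: exponential interarrival times with rate lam.
p0_poisson, P0_poisson = empty_fractions(lambda: rng.expovariate(lam), mu, n_jobs, rng)

# Deterministic arrivals: constant interarrival time 1/lam.
p0_fixed, P0_fixed = empty_fractions(lambda: 1.0 / lam, mu, n_jobs, rng)
```

With Poisson arrivals, the two estimates should nearly coincide, as PASTA predicts. With evenly spaced arrivals, the time-average fraction stays near one minus the utilization, but arrivals find the system empty noticeably more often, because regular spacing gives the server time to clear its work before the next job appears.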