3.2 Having Multiple Processes in the System
Multiprogramming means that multiple processes are handled by the system at the same time, typically by time slicing. It is motivated by considerations of responsiveness to the users and utilization of the hardware.
Note: terminology may be confusing
“Job” and “process” are essentially synonyms. The following terms, however, have slightly different meanings:
Multitasking — having multiple processes time slice on the same processor.
Multiprogramming — having multiple jobs in the system, either on the same processor or on different processors.
Multiprocessing — using multiple processors for the same job or system, i.e. parallel computing.
When there is only one CPU, multitasking and multiprogramming are the same thing. In a parallel system or cluster, you can have multiprogramming without multitasking, by running jobs on different CPUs.
3.2.1 Multiprogramming and Responsiveness
One reason for multiprogramming is to improve responsiveness, which means that users will have to wait less on average for their jobs to complete.
With FCFS, short jobs may be stuck behind long ones
Consider the following system, in which jobs are serviced in the order they arrive (First Come First Serve, or FCFS):
[Figure: FCFS timeline — job 1 runs to completion, then job 2, then job 3. Jobs 2 and 3 wait in the queue after arriving; job 3 cannot start until the long job 2 terminates.]
Job 2 is ahead of job 3 in the queue, so when job 1 terminates, job 2 runs. However, job 2 is very long, so job 3 must wait a long time in the queue, even though it itself is
short.
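The FCFS behavior described above can be sketched in a few lines of code (an illustration, not part of the original text; the job arrival times and runtimes are hypothetical):

```python
# Sketch: response times (completion minus arrival) under FCFS.
# Each job is an (arrival_time, runtime) pair; values are made up.
def fcfs_response_times(jobs):
    """Run jobs to completion in arrival order and return the
    response time experienced by each job."""
    clock = 0.0
    responses = []
    for arrival, runtime in sorted(jobs):
        clock = max(clock, arrival) + runtime  # wait for the CPU, then run
        responses.append(clock - arrival)
    return responses

# Job 3 is short (1 time unit) but arrives behind the long job 2:
jobs = [(0, 3), (1, 10), (2, 1)]
print(fcfs_response_times(jobs))  # → [3.0, 12.0, 12.0]
```

Note that job 3's response time (12.0) is dominated by waiting for job 2, even though its own runtime is only 1 unit — exactly the situation in the figure.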
If the CPU was shared, this wouldn’t happen
Now consider an ideal system that supports processor sharing: when there are k jobs in the system, they all run simultaneously, but each at a rate of 1/k.
[Figure: processor-sharing timeline — jobs 1, 2, and 3 run concurrently at reduced rates as they arrive; the short job 3 terminates soon after arriving, before the long job 2.]
Now job 3 does not have to wait for job 2. The time it takes is proportional to its own length, increased according to the current load.
Regrettably, it is impossible to implement this ideal. But we’ll see below that it can be approximated by using time slicing.
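The approximation by time slicing can be sketched as follows (an illustration with assumed quantum and runtimes, not from the text): cycle through the ready jobs, giving each a small quantum in turn, so that over any longer interval each of the k jobs gets about 1/k of the CPU.

```python
# Sketch: round-robin time slicing with a small quantum q approximates
# processor sharing. All jobs are assumed to arrive at time 0.
def round_robin(runtimes, q=0.01):
    """Return the completion time of each job under round-robin."""
    remaining = list(runtimes)
    done = [None] * len(runtimes)
    clock = 0.0
    while any(d is None for d in done):
        for i in range(len(remaining)):
            if done[i] is not None:
                continue
            slice_ = min(q, remaining[i])  # run one quantum (or less)
            clock += slice_
            remaining[i] -= slice_
            if remaining[i] <= 1e-12:
                done[i] = clock
    return done

# Under ideal processor sharing, jobs of length 1 and 2 that start
# together finish at times 2 and 3; round robin comes very close:
print(round_robin([1.0, 2.0]))
```

Shrinking the quantum q makes the approximation tighter, at the cost of more context switches.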
Responsiveness is important to keep users happy
Users of early computer systems didn’t expect good responsiveness: they submitted a job to the operator, and came to get the printout the next day. But when interactive systems were introduced, users got angry when they had to wait just a few minutes. Actually, good responsiveness for interactive work (e.g. text editing) is measured in fractions of a second.
Supporting interactive work is important because it improves productivity. A user can submit a job and get a response while it is “still in his head”. It is then possible
to make modifications and repeat the cycle.
To read more: The effect of responsiveness on users’ anxiety was studied by Guynes, who showed that bad responsiveness is very annoying even for people who are normally very relaxed [5].
Actually, it depends on workload statistics
The examples shown above had a short job stuck behind a long job. Is this really a common case?
Consider a counter example, in which all jobs have the same length. In this case, a job that arrives first and starts running will also terminate before a job that arrives later. Therefore preempting the running job in order to run the new job delays it and degrades responsiveness.
Exercise 48 Consider applications you run daily. Do they all have similar runtimes, or are some short and some long?
The way to go depends on the coefficient of variation (CV) of the distribution of job runtimes. The coefficient of variation is the standard deviation divided by the mean. This is a sort of normalized version of the standard deviation, and measures how wide the distribution is. “Narrow” distributions have a small CV, while very wide or fat-tailed distributions have a large CV. The exponential distribution has CV = 1.
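The definition can be illustrated directly (the sample values below are made up for the example):

```python
# Sketch: the coefficient of variation is std deviation / mean.
import statistics

def cv(samples):
    return statistics.pstdev(samples) / statistics.mean(samples)

narrow = [9, 10, 11, 10, 10]   # similar runtimes: CV well below 1
wide = [1, 1, 1, 1, 96]        # one very long job: CV above 1
print(cv(narrow), cv(wide))    # the "wide" workload has the larger CV
```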
Returning to jobs in computer systems, if the CV is smaller than 1 then we can expect new jobs to be similar to the running job. In this case it is best to leave the running job alone and schedule additional jobs FCFS. If the CV is larger than 1, on the other hand, then we can expect new jobs to be shorter than the current job. Therefore it is best to preempt the current job and run the new job instead.
Measurements from several different systems show that the distribution of job runtimes is heavy-tailed. There are many very short jobs, some “middle” jobs, and few long jobs, but some of the long jobs are very long. The CV is always larger than 1 (values from about 3 to about 70 have been reported). Therefore responsiveness is improved by using preemption and time slicing, and the above examples are correct.
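The effect can be demonstrated with a small calculation (an illustration with an assumed workload, not from the text): when a very long job arrives first, FCFS makes every later job wait for it, whereas processor sharing lets the short jobs finish quickly, so the mean response time drops sharply.

```python
# Sketch: mean response time of FCFS vs. ideal processor sharing
# for a batch of jobs that all arrive at time 0.
def fcfs_mean(runtimes):
    clock, total = 0.0, 0.0
    for r in runtimes:              # served in the given (arrival) order
        clock += r
        total += clock
    return total / len(runtimes)

def ps_mean(runtimes):
    # Under processor sharing the shortest job finishes first; each
    # job still in the system gets an equal share of the CPU.
    rts = sorted(runtimes)
    clock, prev, total = 0.0, 0.0, 0.0
    for i, r in enumerate(rts):
        clock += (r - prev) * (len(rts) - i)  # time until job i finishes
        prev = r
        total += clock
    return total / len(rts)

jobs = [100, 1, 1, 1]  # one very long job arrives first
print(fcfs_mean(jobs), ps_mean(jobs))  # → 101.5 28.75
```

With a low-CV workload (all jobs the same length) the two policies would give similar means; the advantage of sharing comes from the skewed, high-CV mix.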
To read more: The benefit of using preemption when the CV of service times is greater than 1 was established by Regis [13].
Details: the distribution of job runtimes
There is surprisingly little published data about real measurements of job runtimes and their distributions. Given the observation that the CV should be greater than 1, a common procedure is to choose a simple distribution that matches the first two moments, and thus has the correct mean and CV. The chosen distribution is usually a two-stage hyper-exponential, i.e. the probabilistic combination of two exponentials. However, this procedure fails to actually create a distribution with the right shape, and might lead to erroneous performance evaluations, as demonstrated by Lazowska [9].
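Sampling from such a two-stage hyper-exponential is straightforward (a sketch with assumed parameters p, m1, m2, not values from the text): with probability p draw from one exponential, otherwise from the other.

```python
# Sketch: a two-stage hyper-exponential — the probabilistic
# combination of two exponential distributions.
import random
import statistics

def hyperexp(p, m1, m2):
    """With probability p draw from Exp(mean m1), else Exp(mean m2)."""
    mean = m1 if random.random() < p else m2
    return random.expovariate(1.0 / mean)

random.seed(0)
# Mostly short jobs (mean 1), occasionally a long one (mean 50):
samples = [hyperexp(0.9, 1.0, 50.0) for _ in range(100_000)]
cv = statistics.pstdev(samples) / statistics.mean(samples)
print(cv > 1)  # → True (the theoretical CV here is about 3.7)
```

Mixing two very different means is what pushes the CV above 1; a single exponential would give CV = 1 exactly.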
An interesting model for interactive systems was given by Leland and Ott [10], and later verified by Harchol-Balter and Downey [7]. This model holds for processes that are longer than a couple of seconds, on Unix systems. For such processes, the observed distribution is
Pr(r > t) = 1/t
where r denotes the process runtime. In other words, the tail of the distribution of runtimes has a Pareto distribution.
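This tail is easy to reproduce by inverse transform sampling (a sketch, not from the text): if u is uniform on (0, 1), then r = 1/(1 − u) satisfies Pr(r > t) = 1/t for t ≥ 1.

```python
# Sketch: sample Pareto runtimes with tail Pr(r > t) = 1/t and
# check the empirical tail frequencies against 1/t.
import random

random.seed(1)
n = 200_000
# 1 - random.random() lies in (0, 1], so r >= 1 and no division by zero:
runtimes = [1.0 / (1.0 - random.random()) for _ in range(n)]
for t in (2, 10, 100):
    frac = sum(r > t for r in runtimes) / n
    print(t, round(frac, 3))  # frac should be close to 1/t
```

Note the hallmark of a heavy tail: even at t = 100, a non-negligible fraction (about 1%) of the jobs are still running.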
3.2.2 Multiprogramming and Utilization