
3.2 Having Multiple Processes in the System

Multiprogramming means that multiple processes are handled by the system at the same time, typically by time slicing. It is motivated by considerations of responsiveness to the users and utilization of the hardware.

Note: terminology may be confusing

"Job" and "process" are essentially synonyms. The following terms, however, have slightly different meanings:

Multitasking — having multiple processes time slice on the same processor.
Multiprogramming — having multiple jobs in the system, either on the same processor or on different processors.
Multiprocessing — using multiple processors for the same job or system, i.e. parallel computing.

When there is only one CPU, multitasking and multiprogramming are the same thing. In a parallel system or cluster, you can have multiprogramming without multitasking, by running jobs on different CPUs.

3.2.1 Multiprogramming and Responsiveness

One reason for multiprogramming is to improve responsiveness, which means that users will have to wait less, on average, for their jobs to complete.

With FCFS, short jobs may be stuck behind long ones

Consider a system in which jobs are serviced in the order they arrive (First Come First Serve, or FCFS).

[Figure: timeline of three jobs under FCFS — job 1 runs to completion, then job 2; job 3 waits in the queue until job 2 terminates.]

Job 2 is ahead of job 3 in the queue, so when job 1 terminates, job 2 runs. However, job 2 is very long, so job 3 must wait a long time in the queue, even though it itself is short.

If the CPU was shared, this wouldn't happen

Now consider an ideal system that supports processor sharing: when there are k jobs in the system, they all run simultaneously, but each at a rate of 1/k.

[Figure: timeline of the same three jobs under processor sharing — all jobs in the system run together, each at a reduced rate, and job 3 terminates before job 2.]

Now job 3 does not have to wait for job 2. The time it takes is proportional to its own length, increased according to the current load. Regrettably, it is impossible to implement this ideal. But we'll see below that it can be approximated by using time slicing.

Responsiveness is important to keep users happy

Users of early computer systems didn't expect good responsiveness: they submitted a job to the operator, and came to get the printout the next day. But when interactive systems were introduced, users got angry when they had to wait just a few minutes. Actually, good responsiveness for interactive work (e.g. text editing) is measured in fractions of a second. Supporting interactive work is important because it improves productivity. A user can submit a job and get a response while it is "still in his head". It is then possible to make modifications and repeat the cycle.
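The contrast between FCFS and ideal processor sharing can be checked numerically. The following sketch uses hypothetical job lengths chosen to mirror the scenario above (job 2 is long, and the short job 3 arrives behind it); the job values and helper names are our own illustration, not from the text.

```python
# Hypothetical workload: (arrival time, service demand) per job.
jobs = [
    (0.0, 3.0),   # job 1
    (1.0, 10.0),  # job 2 (long)
    (2.0, 1.0),   # job 3 (short, stuck behind job 2 under FCFS)
]

def fcfs(jobs):
    """Run jobs to completion in arrival order; return finish times."""
    t, finish = 0.0, []
    for arrival, demand in jobs:
        t = max(t, arrival) + demand
        finish.append(t)
    return finish

def processor_sharing(jobs, dt=1e-4):
    """Ideal processor sharing: k active jobs each progress at rate 1/k.
    Approximated here by stepping time in small increments."""
    remaining = [d for _, d in jobs]
    finish = [None] * len(jobs)
    t = 0.0
    while any(f is None for f in finish):
        active = [i for i, (a, _) in enumerate(jobs)
                  if a <= t and finish[i] is None]
        for i in active:
            remaining[i] -= dt / len(active)
            if remaining[i] <= 0 and finish[i] is None:
                finish[i] = t + dt
        t += dt
    return finish

print("FCFS finish times:", fcfs(jobs))              # job 3 ends last
print("PS   finish times:", processor_sharing(jobs)) # job 3 ends first
```

Under FCFS, job 3 terminates at time 14 even though it only needs 1 unit of service; under processor sharing it terminates around time 5, long before job 2.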
To read more: The effect of responsiveness on users' anxiety was studied by Guynes, who showed that bad responsiveness is very annoying even for people who are normally very relaxed [5].

Actually, it depends on workload statistics

The examples shown above had a short job stuck behind a long job. Is this really a common case? Consider a counter example, in which all jobs have the same length. In this case, a job that arrives first and starts running will also terminate before a job that arrives later. Therefore preempting the running job in order to run the new job delays it and degrades responsiveness.

Exercise 48 Consider applications you run daily. Do they all have similar runtimes, or are some short and some long?

The way to go depends on the coefficient of variation (CV) of the distribution of job runtimes. The coefficient of variation is the standard deviation divided by the mean. This is a sort of normalized version of the standard deviation, and it measures how wide the distribution is. "Narrow" distributions have a small CV, while very wide (or fat-tailed) distributions have a large CV. The exponential distribution has CV = 1.

Returning to jobs in computer systems, if the CV is smaller than 1 then we can expect new jobs to be similar to the running job. In this case it is best to leave the running job alone and schedule additional jobs FCFS. If the CV is larger than 1, on the other hand, then we can expect new jobs to be shorter than the current job. Therefore it is best to preempt the current job and run the new job instead.

Measurements from several different systems show that the distribution of job runtimes is heavy tailed. There are many very short jobs, some "middle" jobs, and few long jobs, but some of the long jobs are very long. The CV is always larger than 1 (values from about 3 to about 70 have been reported). Therefore responsiveness is improved by using preemption and time slicing, and the above examples are correct.
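The role of the CV can be illustrated numerically. This sketch estimates the CV of samples from an exponential distribution (CV = 1) and from a two-stage hyper-exponential with many short jobs and a few long ones (CV > 1); the rates and mixing probability are chosen purely for illustration.

```python
import random
import statistics

random.seed(1)

def cv(samples):
    """Coefficient of variation: standard deviation over mean."""
    return statistics.pstdev(samples) / statistics.fmean(samples)

# Exponential runtimes with mean 1.
expo = [random.expovariate(1.0) for _ in range(100_000)]

# Hyper-exponential: 90% short jobs (mean 0.1), 10% long jobs (mean 1).
hyper = [random.expovariate(1.0 if random.random() < 0.1 else 10.0)
         for _ in range(100_000)]

print(f"exponential CV       = {cv(expo):.2f}")   # close to 1
print(f"hyper-exponential CV = {cv(hyper):.2f}")  # well above 1
```

A scheduler facing the second workload benefits from preemption: a newly arrived job is most likely short, and should not wait behind a long running job.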
To read more: The benefit of using preemption when the CV of service times is greater than 1 was established by Regis [13].

Details: the distribution of job runtimes

There is surprisingly little published data about real measurements of job runtimes and their distributions. Given the observation that the CV should be greater than 1, a common procedure is to choose a simple distribution that matches the first two moments, and thus has the correct mean and CV. The chosen distribution is usually a two-stage hyper-exponential, i.e. the probabilistic combination of two exponentials. However, this procedure fails to actually create a distribution with the right shape, and might lead to erroneous performance evaluations, as demonstrated by Lazowska [9].

An interesting model for interactive systems was given by Leland and Ott [10], and later verified by Harchol-Balter and Downey [7]. This model holds for processes that are longer than a couple of seconds, on Unix systems. For such processes, the observed distribution is

Pr(r > t) = 1/t

where r denotes the process runtime. In other words, the tail of the distribution of runtimes has a Pareto distribution.
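The moment-matching procedure mentioned above can be sketched as follows, using the common "balanced means" recipe for fitting a two-stage hyper-exponential to a given mean and CV (it requires CV >= 1). The function names and target values are our own; and, per Lazowska's observation, matching the first two moments does not guarantee the right distribution shape.

```python
import math
import random
import statistics

def fit_hyperexp(mean, cv):
    """Return (p1, rate1, rate2) for a two-stage hyper-exponential
    matching the given mean and CV, using balanced means: each branch
    contributes half the mean, so p1/rate1 = (1-p1)/rate2 = mean/2."""
    s = math.sqrt((cv**2 - 1) / (cv**2 + 1))
    p1 = (1 + s) / 2
    return p1, 2 * p1 / mean, 2 * (1 - p1) / mean

def sample(p1, rate1, rate2):
    """Draw one runtime: pick a branch, then an exponential from it."""
    rate = rate1 if random.random() < p1 else rate2
    return random.expovariate(rate)

random.seed(2)
p1, r1, r2 = fit_hyperexp(mean=1.0, cv=3.0)
xs = [sample(p1, r1, r2) for _ in range(200_000)]
m = statistics.fmean(xs)
c = statistics.pstdev(xs) / m
print(f"sample mean = {m:.2f}, sample CV = {c:.2f}")  # targets: 1.0, 3.0
```

The sampled mean and CV come out close to the targets, yet the fitted distribution has an exponential tail rather than the Pareto tail Pr(r > t) = 1/t observed in real workloads — exactly the shape mismatch the text warns about.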

3.2.2 Multiprogramming and Utilization