A possible solution to such problems is to use several operating system threads as the basis for a user-level thread package. Thus when a user-level thread performs an I/O operation, only the operating system thread currently running it is blocked, and the other operating system threads can still be used to execute the other user-level threads.
Exploiting multiprocessors requires operating system threads
A special case where threads are useful is when running on a multiprocessor, a computer with several physical processors. In this case, the different threads may execute simultaneously on different processors. This leads to a possible speedup of the computation due to the use of parallelism. Naturally, such parallelism will only arise if operating system threads are used. User-level threads that are multiplexed on a single operating system process cannot use more than one processor at a time.
The following table summarizes the properties of kernel threads and user threads, and contrasts them with processes:
processes                 kernel threads            user threads
-----------------------   -----------------------   -----------------------
protected from each       share address space: simple communication,
other; require the        useful for application structuring
operating system to
communicate

high overhead: all        medium overhead:          low overhead:
operations require a      operations require a      everything is done
kernel trap and           kernel trap, but          at user level
significant work          little work

independent: if one blocks, this                    if a thread blocks,
does not affect the others                          the whole process
                                                    is blocked

can run on different processors                     all share the same
in a multiprocessor                                 processor

system-specific API; programs                       the same thread
are not portable                                    library may be
                                                    available on
                                                    several systems

one size fits all                                   application-specific
                                                    thread management
                                                    is possible
In the following, our discussion of processes is generally applicable to threads as well. In particular, the scheduling of threads can use the same policies described
below for processes.
3.1.4 Operations on Processes and Threads
As noted above, a process is an abstraction of the computer, and a thread is an abstraction of the CPU. What operations are typically available on these abstractions?
Create a new one
The main operation on processes and threads is to create a new one. In different systems this may be called a fork, a spawn, or just simply create. A new process is typically created with one thread. That thread can then create additional threads within that same process.

Note that operating systems that support threads have distinct system calls for processes and threads. For example, the “process create” call can be used to create a new process, and then “thread create” can be used to add threads to this process.
Terminate an existing one
The dual of creating a process is terminating it. A process or thread can terminate itself by returning from its main function, or by calling the exit system call.

Exercise 41 If a multithreaded process terminates, what happens to its threads?
Allowing one process to terminate another is problematic — what if the other process belongs to another user who does not want his process to be terminated? The
more common interface is to allow one process to send a signal to another, as described below.
Threads within the same process are less restricted, as it is assumed that if one terminates another this is part of what the application as a whole is supposed to do.
Suspend execution
A thread embodies the flow of a computation. So a desirable operation on it may be to stop this computation.
A thread may suspend itself by going to sleep. This means that it tells the system that it has nothing to do now, and therefore should not run. A sleep is associated with
a time: when this future time arrives, the system will wake the thread up.
Exercise 42 Can you think of an example where this is useful?

Threads in the same process can also suspend each other. Suspend is essentially another state in the thread state transition graph, which is similar to the blocked state. The counterpart of suspend is to resume another thread. A resumed thread is moved from the suspend state to the ready state.
Control over execution is sometimes also useful among processes. For example, a debugger process may control the execution of a process executing the application
being debugged.
Send a signal or message
A common operation among processes is the sending of signals. A signal is often described as a software interrupt: the receiving process receives the signal rather
than continuing with what it was doing before. In many cases, the signal terminates the process unless the process takes some action to prevent this.
Example: Processes and Threads in Unix
Unix processes are generally similar to the description given above. However, there are some interesting details.
To read more: A full discussion of Unix processes is given by Bach [1, Chap. 6], for Unix System V. The BSD version is described by McKusick and friends [11, Chap. 4].
The PCB is divided in two
The Unix equivalent of a PCB is the combination of two data structures. The data items that the kernel may need at any time are contained in the process’s entry in the process table, including priority information used to decide when to schedule the process. The data items that are only needed when the process is currently running are contained in the process’s u-area, including the tables of file descriptors and signal handlers. The kernel is designed so that at any given moment the current process’s u-area is mapped to the same memory addresses, and therefore the data there can be accessed uniformly without process-related indirection.
Exercise 43 Should information about the user be in the process table or the u-area? Hint: it’s in the process table. Why is this surprising? Can you imagine why it is there anyway?
There are many states
The basic process state graph in Unix is slightly more complicated than the one introduced above, and looks like this:
[Figure: Unix process state graph. States: created, ready user, ready kernel, running user, running kernel, blocked, zombie, terminated. Transitions include: schedule, preempt, trap or interrupt, return from kernel, wait for event, and event done.]
Note that the running state has been divided into two: running in user mode and in kernel mode. This is because Unix kernel routines typically run within the context of the current user process, rather than having a separate environment for the kernel. The ready state is also illustrated as two states: one is for preempted processes that will continue to run in user mode when scheduled, and the other is for processes that blocked in a system call and need to complete the system call in kernel mode (the implementation actually has only one joint ready queue, but processes in kernel mode have higher priority and will run first). The zombie state is for processes that terminate, but are still kept in the system. This is done in case another process will later issue the wait system call and check for their termination.

We will enrich this graph with even more states in the future, when we discuss swapping.

Exercise 44 Why isn’t the blocked state divided into blocked in user mode and blocked in kernel mode?
Exercise 45 The arrow from ready user to running user shown in this graph does not really exist in practice. Why?
The fork system call duplicates a process
In Unix, new processes are not created from scratch. Rather, any process can create a new process by duplicating itself. This is done by calling the fork system call. The new process will be identical to its parent process: it has the same data, executes the same program, and in fact is at exactly the same place in the execution. The only differences are their process IDs and the return value from the fork. Consider a process structured schematically as in the following figure:
[Figure: the parent process before the fork. PCB: pid = 758, ppid = 699, uid = 31, PC = 13, SP = 79. Text segment: 11: x = 1; 12: y = x + 2; 13: pid = fork; 14: if pid == 0 { 15: child 16: } else { 17: parent 18: }. Data segment: pid (undefined), x: 1, y: 3. Stack: a frame for fork at 79, with saved registers, return address 13, and an undefined return value.]
It has a process ID pid of 758, a user ID uid of 31, text, data, and stack segments, and so on. The “pid” in the data segment is the name of the variable that is assigned in instruction 13; the process ID itself is stored in the PCB. Its program counter PC is on instruction 13, the call to fork.

Calling fork causes a trap to the operating system. From this point, the process is not running any more. The operating system is running, in the fork function. This function examines the process and duplicates it.
First, fork allocates all the resources needed by the new process. This includes a new PCB and memory for a copy of the address space, including the stack. Then the contents of the parent PCB are copied into the new PCB, and the contents of the parent address space are copied into the new address space. The text segment, with the program code, need not be copied: it is shared by both processes.
The result is two processes as shown here:

[Figure: parent and child after the fork. Parent PCB: pid = 758, ppid = 699, uid = 31, PC = 52, SP = 79. Child PCB: pid = 829, ppid = 758, uid = 31, PC = 52, SP = 79. Both have identical data segments (pid undefined, x: 1, y: 3) and stacks with a frame for fork (return address 13); the only difference is the frame’s return value, which is 829 in the parent and 0 in the child. The text segment is shared.]
The new process has a process ID of 829, and a parent process ID ppid of 758 — as one might expect. The user ID and all other attributes are identical to the parent.
The address space is also an exact copy, except for the stack, where different return values are indicated: in the parent,
fork will return the process ID of the new child process, whereas in the child, it will return 0. When the processes are scheduled to
run, they will continue from the same place — the end of the fork, indicated by a PC
value of 52. When the fork function actually returns, the different return values will
be assigned to the variable pid, allowing the two processes to diverge and perform different computations.
Exercise 46 What is the state of the newly created process?

Note that as the system completes the fork, it is left with two ready processes: the parent and the child. These will be scheduled at the discretion of the scheduler. In principle, either may run before the other.

To summarize, fork is a very special system call: it is “a system call that returns twice”, in two separate processes. These processes typically branch on the return value from the fork, and do different things.
The exec system call replaces the program being executed
In many cases, the child process calls the exec system call, which replaces the program that is being executed. This means

1. Replace the text segment with that of the new program.
2. Replace the data segment with that of the new program, initialized as directed by the compiler.
3. Re-initialize the heap.
4. Re-initialize the stack.
5. Point the program counter to the program’s entry point.

When the process is subsequently scheduled to run, it will start executing the new program. Thus exec is also a very special system call: it is “a system call that never returns”, because if it succeeds the context in which it was called does not exist anymore.
Exercise 47 One of the few things that the new program should inherit from the old one is the environment: the set of ⟨name, value⟩ pairs of environment variables and their values. How can this be done if the whole address space is re-initialized?
The environment can be modified between the fork and the exec
While exec replaces the program being run, it does not re-initialize the whole environment. In particular, the new program inherits open files from its predecessor. This is used when setting up pipes, and is the reason for keeping fork and exec separate. It is described in more detail in Section 12.2.3.
Originally, Unix did not support threads
Support for threads should be included in the operating system design from the outset. But the original Unix systems from the 1970s did not have threads; in other words, each process had only one thread of execution. This caused various complications when threads were added to modern implementations. For example, in Unix processes are created with the fork system call, which duplicates the process which called it, as described above. But with threads, the semantics of this system call become unclear: should all the threads in the forked process be duplicated in the new one? Or maybe only the calling thread? Another example is the practice of storing the error code of a failed system call in the global variable errno. With threads, different threads may call different system calls at the same time, and the error values will overwrite each other if a global variable is used.
Implementing user-level threads with setjmp and longjmp
The hardest problem in implementing threads is the need to switch among them. How is this done at user level?
If you think about it, all you really need is the ability to store and restore the CPU’s general-purpose registers, and to set the stack pointer to point into the correct stack. This can actually be done with the appropriate assembler code (you can’t do it in a high-level language, because such languages typically don’t have a way to say you want to access the stack pointer). You don’t need to modify the special registers like the PSW and those used for memory mapping, because they reflect shared state that is common to all the threads; thus you don’t need to run in kernel mode to perform the thread context switch.
In Unix, jumping from one part of the program to another can be done using the setjmp and longjmp functions that encapsulate the required operations. setjmp essentially stores the CPU state into a buffer. longjmp restores the state from a buffer created with setjmp. The names derive from the following reasoning: setjmp sets things up to enable you to jump back to exactly this place in the program. longjmp performs a long jump to another location.

To implement threads, assume each thread has its own buffer. Let’s assume that bufA will be used to store the state of the current thread. The code that implements a context switch is then simply
    switch() {
        if (setjmp(bufA) == 0) {
            schedule();
        }
    }

The setjmp function stores the state of the current thread in bufA, and returns 0. Therefore we enter the if, and the function schedule is called. This function does the following:
    schedule() {
        bufB = select-thread-to-run
        longjmp(bufB, 1);
    }

This restores the state that was previously stored in bufB. Note that this second buffer already contains the state of another thread, that was stored in it by a previous call
to setjmp. The result is that we are again inside the call to setjmp that originally stored the state in bufB. But this time, setjmp will return a value of 1, not 0 (this is specified by the second argument to longjmp). Thus, when the function returns, the if surrounding it will fail, and schedule will not be called again immediately. Instead, execution will continue where it left off before calling the switching function.

User-level thread packages, such as pthreads, are based on this type of code. But they provide a more convenient interface for programmers, enabling them to ignore the complexities of implementing the context switching and scheduling.
3.2 Having Multiple Processes in the System