
The idea is simple: if you want a program to perform more than one operation at a time, you need to keep track of the information for each thing the program does. A thread is simply the name for the data structure that does this.

There is a slight problem here. Namely, on a single-processor machine, how can a program be capable of doing more than one thing at a time? The answer is that either the JVM (via so-called green threads) or the operating system (via native threads) is responsible for making sure that each thread is occasionally active. That is, either the JVM or the OS manages the threads and makes sure that each thread can use a percentage of the processor's time. The process of doing this is often called time-slicing or context-switching; the piece of code that does it is often called the thread scheduler. Time-slicing is a rather expensive process. The price you pay for doing two things at once is the cost of switching between the associated threads and, occasionally, of copying the local caches to the heap. All of this terminology is conveniently, if slightly inaccurately, summed up in the JVM's internal structure illustrated in Figure 11-4.

Figure 11-4. Internal structure of the JVM
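To make the idea of the scheduler handing out processor time concrete, here is a minimal sketch (the class name is mine, not from the text). Two threads each do their own work; the thread scheduler decides when each one runs, so the interleaving of the two loops can differ from run to run, even though each thread's own result is deterministic.

```java
public class TimeSliceDemo {
    // Each thread increments its own counter. The scheduler
    // interleaves the two threads however it sees fit, but since
    // neither counter is shared, the final values are predictable.
    static int countA = 0;
    static int countB = 0;

    public static void main(String[] args) throws InterruptedException {
        Thread a = new Thread(() -> { for (int i = 0; i < 1000; i++) countA++; });
        Thread b = new Thread(() -> { for (int i = 0; i < 1000; i++) countB++; });
        a.start();
        b.start();
        a.join();   // wait for both threads to finish
        b.join();
        System.out.println("countA=" + countA + " countB=" + countB);
        // prints: countA=1000 countB=1000
    }
}
```

If the two threads incremented a *shared* counter instead, the result would no longer be predictable; that is exactly the problem mutexes, described next, exist to solve.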

11.2.4 Mutexes

The final piece of threading terminology we need is the idea of a mutex variable, or mutex.[3] A mutex, short for mutual exclusion, is a locking mechanism. Mutex variables have the following three properties:

1. They support at least two operations: lock and unlock, often referred to as get and release.
2. Lock and unlock are mutually exclusive operations. That is, at most one call to lock will succeed, after which all calls to lock will fail until the thread that locked the mutex variable calls unlock.
3. They are global in scope. That is, mutex variables aren't copied into the local caches of threads.

[3] Mutexes are also frequently referred to as locks, and I will occasionally do so when the meaning is clear. However, because there are many possible meanings for lock, I'll stick with mutex most of the time.
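The lock/unlock behavior described above maps directly onto `java.util.concurrent.locks.ReentrantLock` (available in later JDKs; the class name `MutexDemo` below is mine). This sketch shows property 2 in action: while one thread holds the mutex, another thread's attempt to lock it fails.

```java
import java.util.concurrent.locks.ReentrantLock;

public class MutexDemo {
    // A ReentrantLock behaves like the mutex described above.
    // (One caveat: it is reentrant, so the *owning* thread may
    // lock it again; "all calls fail" applies to other threads.)
    static final ReentrantLock mutex = new ReentrantLock();

    public static void main(String[] args) throws InterruptedException {
        mutex.lock();                       // this thread now owns the mutex
        Thread other = new Thread(() -> {
            // tryLock returns false instead of blocking, so we can
            // observe the lock call failing while the mutex is held
            System.out.println("other thread acquired lock? " + mutex.tryLock());
        });
        other.start();
        other.join();
        mutex.unlock();                     // release: lock calls can now succeed
        System.out.println("available after unlock? " + mutex.tryLock());
        // prints: other thread acquired lock? false
        //         available after unlock? true
    }
}
```

The same mutual-exclusion behavior underlies Java's built-in `synchronized` keyword, which locks an object's monitor on entry and unlocks it on exit.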

11.2.5 Applying This to the Printer Server

Let's return to our description of the printer server for a moment. We said we wanted the following behavior: when a print request is received, take the entire document and put it into a print queue on the server machine, then immediately return to handle the next remote method call. At the same time, since marshalling and demarshalling a document can be a lengthy process, and since the document may take a long time to be sent over the network, continue to simultaneously accept other documents and respond to method calls that are querying the server's status.

This can be split into three simultaneous tasks:

1. Actual printing, which removes documents from the printer queue and sends them to the printer
2. Answering questions about the status of the printer server
3. Receiving documents

And this naturally leads to a thread-based decomposition of the application:

• A single thread for actual printing. Because you can't simultaneously print more than one document, more threads would simply cause problems.

• A single thread for answering questions about the status of the printer server. This is likely to be a really fast operation, and there aren't going to be many questions in a typical use scenario. Since threads do cost us resources, we should probably have only a single thread here, at least until we discover we need more.

• Many threads for receiving documents. Since you need to receive more than one document at once, and because receiving a single document isn't likely to stress the server (the bottleneck is much more likely to be either the client or the network between the client and the server), you should allocate multiple threads for receiving documents.

The last point is slightly deceptive. Even if the server were a bottleneck, and even if introducing threading to respond to multiple print requests at once substantially slowed down the server, it's still almost always a good idea to do so.
Basically, the decision boils down to choosing one of the following alternatives:

The faster application
Uses only a single thread to receive documents. There's less context-switching going on, and the overall amount of processor time devoted to receiving documents is fairly high. However, if a client tries to send a document and the server is busy, the client simply waits with no feedback.

The slower application
Uses multiple threads to receive documents. There's more context-switching going on and, consequently, less processor time devoted to receiving documents. On the other hand, the client application can display a progress bar to let the user know what percentage of the document has been transmitted.

While this may not seem terribly relevant with a simple printer server, this particular design trade-off is ubiquitous in distributed computing. You can maximize application performance, or you can trade some performance in order to tell the user what's going on. It might seem counterintuitive that the faster application is the less-responsive one, but there you have it.
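The thread-based decomposition above can be sketched using `java.util.concurrent` thread pools. All names here are hypothetical (this is not the book's printer-server code, and it omits the RMI layer entirely): one thread drains the print queue, a small pool receives documents, and status queries run on the caller's own thread.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class PrinterServerSketch {
    private final BlockingQueue<String> printQueue = new LinkedBlockingQueue<>();
    // A single printing thread: the printer can only print one document at a time.
    private final ExecutorService printer = Executors.newSingleThreadExecutor();
    // Several receiving threads: clients can send documents concurrently.
    private final ExecutorService receivers = Executors.newFixedThreadPool(4);

    public PrinterServerSketch() {
        printer.submit(() -> {
            try {
                while (true) {
                    String doc = printQueue.take();   // blocks until a document arrives
                    sendToPrinter(doc);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();   // shut down the printing loop
            }
        });
    }

    // Called for each print request; hands the document to a receiving
    // thread and returns immediately, freeing the caller.
    public void printDocument(String document) {
        receivers.submit(() -> printQueue.add(document));
    }

    // Status queries are cheap, so they run on the calling thread.
    public int queueLength() {
        return printQueue.size();
    }

    public void shutdown() {
        printer.shutdownNow();    // interrupts the printing loop
        receivers.shutdown();
    }

    private void sendToPrinter(String doc) {
        System.out.println("printing: " + doc);
    }
}
```

Note how the design choices from the bullet list fall out directly: `newSingleThreadExecutor` enforces one printing thread, while the fixed pool of receivers bounds the context-switching cost of accepting many documents at once.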

11.3 Threading Concepts