BASICS OF PACKET SWITCHING
throughput significantly, but increases the implementation complexity of the input buffers and the arbitration mechanism. This is because the input buffers
can no longer use simple FIFO memory, and more cells need to be arbitrated in each time slot. Several techniques have been proposed to
increase the throughput and are discussed in detail in Chapter 3.
2.2.3.5 Output-Buffered Switches
The output-buffered switch, shown in Figure 2.15(e), allows all incoming cells to arrive at the output port. Because there is no HOL blocking, the switch can achieve 100% throughput.
However, since the output buffer needs to store N cells in each time slot, its memory speed will limit the switch size. A concentrator can be used to
alleviate the memory speed limitation problem so as to have a larger switch size. The disadvantage of this remedy is the inevitable cell loss in the
concentrator.
2.2.3.6 Shared-Buffer Switches
Figure 2.15(f) shows the shared-buffer switch, which will be discussed in detail in Chapter 4.
2.2.3.7 Multistage Shared-Buffer Switches
The shared-buffer architecture has been widely used to implement small-scale switches because of its
high throughput, low delay, and high memory utilization. Although a large-scale switch can be realized by interconnecting multiple shared-buffer switch
modules, as shown in Figure 2.15(g), the system performance is degraded by
internal blocking. Because queue lengths differ between the first- and second-stage modules, maintaining cell sequence at the output module can
be very complex and expensive.
2.2.3.8 Input- and Output-Buffered Switches
Input- and output-buffered switches, as shown in Figure 2.15(h), are intended to combine the advantages of input buffering and output buffering. In input buffering, the
input buffer speed is comparable to the input line rate. In output buffering, each output port can accept up to L (1 < L < N) cells in each
time slot. If more than L cells are destined for the same output port, the excess cells are stored in the input buffers instead of being discarded as in
the concentrator. To achieve a desired throughput, the speedup factor L can be engineered based on the input traffic distribution. Since the output buffer
memory only needs to operate at L times the line rate, a large-scale switch can be achieved by using input and output buffering. However, this type of
switch requires a complicated arbitration mechanism to determine which L cells among the N HOL cells may go to the output port.
Another kind of speedup is to run the switch fabric at a higher rate than the input and output line rate. In other words, during each cell slot, more than
one cell can be transmitted from an input to an output.
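The per-slot behavior of an input- and output-buffered switch with speedup L can be sketched as follows. This is only an illustration: the function name `arbitrate_slot`, the representation of each input buffer as a deque of destination indices, and the random choice among contending HOL cells are assumptions made here, since the text does not specify a particular arbitration policy.

```python
import random
from collections import deque


def arbitrate_slot(input_queues, L, rng=random):
    """Serve one time slot; each output accepts at most L HOL cells.

    input_queues: list of deques, one per input; each entry is the
    destination output index of a buffered cell (hypothetical layout).
    Returns {output: sorted list of inputs whose HOL cell was accepted}.
    """
    # Group the inputs by the destination of their HOL cell.
    contenders = {}
    for i, queue in enumerate(input_queues):
        if queue:
            contenders.setdefault(queue[0], []).append(i)

    accepted = {}
    for out, inputs in contenders.items():
        # Assumed policy: pick L winners at random when more than L contend.
        winners = inputs if len(inputs) <= L else rng.sample(inputs, L)
        for i in winners:
            input_queues[i].popleft()  # accepted cell leaves the input buffer
        accepted[out] = sorted(winners)
    # Losing HOL cells simply stay in their input buffers for the next slot.
    return accepted
```

For example, with three HOL cells destined for output 0 and L = 2, two cells are accepted and the third waits in its input buffer rather than being dropped.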
2.2.3.9 Virtual-Output-Queueing Switches
Virtual-output-queueing (VOQ) switches are proposed as a way to solve the HOL blocking problem
encountered in the input-buffered switches. As shown in Figure 2.15(i), each
input buffer of the switch is logically divided into N logical queues. These N logical queues share the same physical memory, and
each holds the cells destined for one output port. HOL blocking is thus reduced, and the throughput is increased. However, this type of switch
requires a fast and intelligent arbitration mechanism. Since the HOL cells of all logical queues in the input buffers, whose total number is $N^2$,
need to be arbitrated in each time slot, the arbitration becomes the bottleneck of the switch.
Several scheduling schemes for the switch using the VOQ structure are discussed in detail in Chapter 3.
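The logical division of one input buffer into N per-output queues might be modeled as in the following minimal sketch. The class name `VOQInput` and its methods are hypothetical, and the sketch does not model the shared physical memory; it only shows that each logical queue exposes its own HOL cell.

```python
from collections import deque


class VOQInput:
    """One input port whose buffer is split into N logical queues."""

    def __init__(self, n_outputs):
        # One logical FIFO queue per output port.
        self.queues = [deque() for _ in range(n_outputs)]

    def enqueue(self, cell, output):
        """Store an arriving cell in the logical queue of its output."""
        self.queues[output].append(cell)

    def hol_cells(self):
        """Return {output: HOL cell} for every non-empty logical queue."""
        return {j: q[0] for j, q in enumerate(self.queues) if q}

    def dequeue(self, output):
        """Remove and return the HOL cell of one logical queue."""
        return self.queues[output].popleft()
```

Because `hol_cells` can return up to N candidates per input, an arbiter for N such inputs has to consider up to $N^2$ HOL cells in each time slot, which is the scaling concern noted above.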
2.3 PERFORMANCE OF BASIC SWITCHES
This section describes the performance of three basic switches: input-buffered, output-buffered, and completely shared-buffer.
2.3.1 Input-Buffered Switches
We consider FIFO buffers in evaluating the performance of input queuing. We assume that only the cells at the head of the buffers can contend for their
destined outputs. If more than one cell contends for the same output, only one of them is allowed to pass through the switch, and the others
have to wait until the next time slot. When a HOL cell loses contention, it may at the same time block cells behind it from reaching idle
outputs. As a result, the maximum throughput of the switch is limited and cannot reach 100%.
To determine the maximum throughput, we assume that all the input queues are saturated. That is, there are always cells waiting in each input
buffer, and whenever a cell is transmitted through the switch, a new cell immediately replaces it at the head of the input queue. If there are k cells
waiting at the heads of input queues addressed to the same output, one of them will be selected at random to pass through the switch. In other words,
each of the k HOL cells has equal probability 1/k of being selected.
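The saturation model just described can be explored numerically with a short simulation. The sketch below is an illustration under an added assumption: each new HOL cell picks its destination uniformly and independently among the N outputs, which the text above has not stated. The function name `saturation_throughput` and its parameters are hypothetical.

```python
import random


def saturation_throughput(n_ports, n_slots, seed=0):
    """Estimate per-port throughput of a saturated FIFO input-buffered switch.

    Assumes uniform, independent destinations for every new HOL cell.
    """
    rng = random.Random(seed)
    # Under saturation only the destination of each HOL cell matters.
    hol = [rng.randrange(n_ports) for _ in range(n_ports)]
    delivered = 0
    for _ in range(n_slots):
        # Group inputs by the output their HOL cell is addressed to.
        groups = {}
        for i, dest in enumerate(hol):
            groups.setdefault(dest, []).append(i)
        for inputs in groups.values():
            # Each of the k contenders wins with probability 1/k.
            winner = rng.choice(inputs)
            # A new cell immediately replaces the transmitted one.
            hol[winner] = rng.randrange(n_ports)
            delivered += 1
        # Losing HOL cells stay put and block the cells behind them.
    return delivered / (n_ports * n_slots)
```

Calling `saturation_throughput(16, 5000)` returns an estimate well below 1, which is consistent with the HOL blocking argument; the exact value depends on N and on the assumed traffic.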
Consider all N cells at the heads of input buffers at time slot m. Depending on the destinations, they can be classified into N groups. Some
groups may have more than one cell, and some may have none. For those that have more than one cell, one of the cells will be selected to pass through
the switch, and the remaining cells have to stay until the next time slot. Denote by $B_m^i$ the number of remaining cells destined for output i in the
mth time slot, and by $B^i$ the corresponding random variable in the steady state. Also, denote by $A_m^i$ the number of cells moving to the heads of the
input queues during the mth time slot and destined for output i, and by $A^i$
the corresponding steady-state random variable. Note that a cell can only move to the head of an input queue if the HOL cell in the previous time slot
was removed from that queue for transmission to an output. Hence, the state