
throughput significantly, but increases the implementation complexity of the input buffers and the arbitration mechanism. This is because the input buffers can no longer use simple FIFO memory, and more cells need to be arbitrated in each time slot. Several techniques have been proposed to increase the throughput and are discussed in detail in Chapter 3.

2.2.3.5 Output-Buffered Switches

The output-buffered switch, shown in Figure 2.15(e), allows all incoming cells to arrive at the output port. Because there is no HOL blocking, the switch can achieve 100% throughput. However, since the output buffer must be able to store up to $N$ cells in each time slot, its memory speed limits the switch size. A concentrator can be used to alleviate the memory speed limitation and thus allow a larger switch size. The disadvantage of this remedy is the inevitable cell loss in the concentrator.

2.2.3.6 Shared-Buffer Switches

Figure 2.15(f) shows the shared-buffer switch, which will be discussed in detail in Chapter 4.

2.2.3.7 Multistage Shared-Buffer Switches

The shared-buffer architecture has been widely used to implement small-scale switches because of its high throughput, low delay, and high memory utilization. Although a large-scale switch can be realized by interconnecting multiple shared-buffer switch modules, as shown in Figure 2.15(g), the system performance is degraded by internal blocking. Because the queue lengths in the first- and second-stage modules differ, maintaining cell sequence at the output module can be very complex and expensive.

2.2.3.8 Input- and Output-Buffered Switches

Input- and output-buffered switches, as shown in Figure 2.15(h), are intended to combine the advantages of input buffering and output buffering. In input buffering, the input buffer speed is comparable to the input line rate. In output buffering, each output port can accept up to $L$ ($1 \le L \le N$) cells in each time slot. If more than $L$ cells are destined for the same output port, the excess cells are stored in the input buffers instead of being discarded as in the concentrator. To achieve a desired throughput, the speedup factor $L$ can be engineered based on the input traffic distribution. Since the output buffer memory only needs to operate at $L$ times the line rate, a large-scale switch can be achieved by using input and output buffering. However, this type of switch requires a complicated arbitration mechanism to determine which $L$ of the $N$ HOL cells may go to the output port. Another kind of speedup is to run the switch fabric at a higher rate than the input and output line rates. In other words, during each cell slot, more than one cell can be transmitted from an input to an output.
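The arbitration step described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function name `arbitrate_hol`, the list-of-destinations input format, and the random tie-breaking among contending inputs are all hypothetical choices made here, not the arbitration scheme of any particular switch.

```python
import random
from collections import defaultdict

def arbitrate_hol(hol_dests, L, rng=None):
    """Grant at most L head-of-line (HOL) cells to each output in one time slot.

    hol_dests[i] is the output port requested by the HOL cell of input i,
    or None if input i has no cell waiting.
    Returns (accepted, held):
      accepted maps each requested output to the sorted list of granted inputs;
      held is the sorted list of inputs whose HOL cell stays in the input buffer.
    """
    rng = rng or random.Random()
    # Group the contending inputs by the output they request.
    requests = defaultdict(list)
    for inp, out in enumerate(hol_dests):
        if out is not None:
            requests[out].append(inp)

    accepted, held = {}, []
    for out, inputs in requests.items():
        # Assumption: winners are picked uniformly at random among contenders.
        rng.shuffle(inputs)
        accepted[out] = sorted(inputs[:L])   # up to L cells reach the output buffer
        held.extend(inputs[L:])              # excess cells wait at the inputs, not dropped
    return accepted, sorted(held)

# Example: inputs 0, 1, 2 all request output 0; input 3 requests output 1;
# input 4 is idle. With L = 2, one of the three contenders must wait.
accepted, held = arbitrate_hol([0, 0, 0, 1, None], L=2)
print(accepted, held)
```

With $L = 1$ the sketch behaves like a pure input-buffered switch, and with $L = N$ no cell is ever held back, which corresponds to pure output buffering.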

2.2.3.9 Virtual-Output-Queueing Switches

Virtual-output-queueing (VOQ) switches are proposed as a way to solve the HOL blocking problem encountered in input-buffered switches. As shown in Figure 2.15(i), each input buffer of the switch is logically divided into $N$ logical queues. All $N$ logical queues of an input buffer share the same physical memory, and each holds the cells destined for one particular output port. HOL blocking is thus reduced, and the throughput is increased. However, this type of switch requires a fast and intelligent arbitration mechanism. Since the HOL cells of all logical queues in the input buffers, whose total number is $N^2$, need to be arbitrated in each time slot, the arbitration becomes the bottleneck of the switch. Several scheduling schemes for switches using the VOQ structure are discussed in detail in Chapter 3.
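The VOQ organization can be sketched as a small data structure. This is a minimal illustration, not an implementation from the text: the class name `VOQInputBuffer`, its method names, and the cell-count `capacity` parameter that models the shared physical memory are all assumptions introduced here.

```python
from collections import deque

class VOQInputBuffer:
    """One input port's buffer, logically divided into N per-output FIFO queues."""

    def __init__(self, num_outputs, capacity):
        self.queues = [deque() for _ in range(num_outputs)]  # one logical queue per output
        self.capacity = capacity   # size of the shared physical memory, in cells (assumed)
        self.occupancy = 0         # cells currently stored across all logical queues

    def enqueue(self, cell, output):
        """Store a cell in the logical queue for `output`; return False if memory is full."""
        if self.occupancy >= self.capacity:
            return False
        self.queues[output].append(cell)
        self.occupancy += 1
        return True

    def hol_requests(self):
        """Outputs for which this input has a HOL cell; these enter arbitration."""
        return [out for out, q in enumerate(self.queues) if q]

    def dequeue(self, output):
        """Remove and return the HOL cell of the logical queue for `output`."""
        cell = self.queues[output].popleft()
        self.occupancy -= 1
        return cell

# A cell for output 0 arrives first, then a cell for output 1.
buf = VOQInputBuffer(num_outputs=3, capacity=4)
buf.enqueue("a", 0)
buf.enqueue("b", 1)
print(buf.hol_requests())   # [0, 1]
# Even if output 0 is busy, the cell for output 1 is not blocked behind "a".
print(buf.dequeue(1))       # b
```

With a single FIFO per input, cell "b" would have to wait behind "a". The cost is that an arbiter must now consider up to $N$ requests from each of the $N$ inputs, that is, up to $N^2$ HOL cells per time slot.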

2.3 PERFORMANCE OF BASIC SWITCHES

This section describes the performance of three basic switches: input-buffered, output-buffered, and completely shared-buffer.

2.3.1 Input-Buffered Switches

We consider FIFO buffers in evaluating the performance of input queuing. We assume that only the cells at the head of the buffers can contend for their destined outputs. If more than one cell contends for the same output, only one of them is allowed to pass through the switch, and the others have to wait until the next time slot. When a HOL cell loses contention, it may at the same time block cells behind it from reaching idle outputs. As a result, the maximum throughput of the switch is limited and cannot reach 100%.

To determine the maximum throughput, we assume that all the input queues are saturated. That is, there are always cells waiting in each input buffer, and whenever a cell is transmitted through the switch, a new cell immediately replaces it at the head of the input queue. If there are $k$ cells waiting at the heads of input queues addressed to the same output, one of them will be selected at random to pass through the switch. In other words, each of the HOL cells has equal probability ($1/k$) of being selected.

Consider all $N$ cells at the heads of the input buffers at time slot $m$. Depending on their destinations, they can be classified into $N$ groups. Some groups may have more than one cell, and some may have none. For those that have more than one cell, one of the cells will be selected to pass through the switch, and the remaining cells have to stay until the next time slot. Denote by $B_m^i$ the number of remaining cells destined for output $i$ in the $m$th time slot, and by $B^i$ the corresponding random variable in the steady state. Also, denote by $A_m^i$ the number of cells moving to the heads of the input queues during the $m$th time slot and destined for output $i$, and by $A^i$ the corresponding steady-state random variable. Note that a cell can only move to the head of an input queue if the HOL cell in the previous time slot was removed from that queue for transmission to an output. Hence, the state