Multiline Input Smoothing Speedup Parallel Switch

and the offered load ρ is the portion of time that a time slot is active:

\[
\rho = \frac{1/p}{1/p + \sum_{j=0}^{\infty} j q (1-q)^{j}} = \frac{q}{q + p - pq}.
\]

Under bursty traffic, it has been shown that, when N is large, the throughput of an input-buffered switch is between 0.5 and 0.586, depending on the burstiness [19].
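The two forms of the load expression can be checked against each other numerically. The sketch below is only an illustration: the function names are hypothetical, and it assumes, as the formula suggests, that 1/p is the mean active-period length and that an idle period lasts j slots with probability q(1 - q)^j.

```python
def offered_load_closed_form(p, q):
    """Closed form: rho = q / (q + p - p*q)."""
    return q / (q + p - p * q)


def offered_load_from_means(p, q, terms=10_000):
    """Ratio form: mean active length over mean active plus mean idle length.

    The infinite sum over j is truncated at `terms`, which is an assumption
    that holds well when (1 - q)**terms is negligible.
    """
    mean_active = 1.0 / p
    mean_idle = sum(j * q * (1.0 - q) ** j for j in range(terms))
    return mean_active / (mean_active + mean_idle)


if __name__ == "__main__":
    p, q = 0.1, 0.2  # illustrative values, not taken from the text
    print(offered_load_closed_form(p, q))
    print(offered_load_from_means(p, q))
```

For p = 0.1 and q = 0.2 both functions should agree at roughly 0.714, since the mean idle period is (1 - q)/q = 4 slots against a mean active period of 10 slots.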

3.2 METHODS FOR IMPROVING PERFORMANCE

The throughput limitation of an input-buffered switch is primarily due to the bandwidth constraint and the inefficiency of scheduling cells from inputs to outputs. To improve the throughput performance, we can either develop more efficient scheduling methods or simply increase the internal capacity.

3.2.1 Increasing Internal Capacity

3.2.1.1 Multiline Input Smoothing

Figure 3.3 illustrates an arrangement where the cells within a frame of b time slots at each of the N inputs are simultaneously launched into a switch fabric of size Nb × Nb [13, 9]. At most Nb cells enter the fabric, of which b can be simultaneously received at each output. In this architecture, the out-of-sequence problem may occur at any output buffer. Although intellectually interesting, input smoothing does not seem to have much practical value.

3.2.1.2 Speedup

A speedup factor of c means that the switch fabric runs c times as fast as the input and output ports [20, 12]. A time slot is further divided into c cycles, and cells are transferred from inputs to outputs in every cycle. Each input (output) can transmit (accept) c cells in a time slot. Simulation studies show that a speedup factor of 2 yields 100% throughput [20, 12].

Fig. 3.3 Input smoothing.

The term "speedup" is also used in the literature with a second meaning: at most one cell can be transferred from an input in a time slot, but during the same time slot an output can accept up to c cells [27, 6, 28]. Under bursty traffic, a speedup factor of 2 achieves only 82.8% to 88.5% throughput, depending on the degree of input traffic correlation (burstiness) [19].
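To give a feel for why a modest speedup helps, the following is a minimal simulation sketch of a FIFO input-buffered switch under saturation. It is not the method used in the cited studies. It assumes uniform random destinations, inputs that always have a head-of-line cell waiting, and random selection among contending inputs; the function name and parameters are hypothetical.

```python
import random


def hol_saturation_throughput(n_ports, n_cycles, seed=0):
    """Estimate the per-cycle throughput of a saturated FIFO input-queued switch.

    Assumptions (illustrative only): every input always holds a head-of-line
    (HOL) cell, destinations are uniform over the outputs, and each output
    picks one contending input at random per cycle.
    """
    rng = random.Random(seed)
    # hol[i] is the output requested by the HOL cell of input i.
    hol = [rng.randrange(n_ports) for _ in range(n_ports)]
    delivered = 0
    for _ in range(n_cycles):
        # Group the inputs by the output their HOL cell requests.
        contenders = {}
        for inp, out in enumerate(hol):
            contenders.setdefault(out, []).append(inp)
        # Each requested output accepts exactly one cell in this cycle.
        for inputs in contenders.values():
            winner = rng.choice(inputs)
            # The winner's next queued cell becomes its new HOL cell.
            hol[winner] = rng.randrange(n_ports)
            delivered += 1
    return delivered / (n_ports * n_cycles)


if __name__ == "__main__":
    for n in (2, 8, 32):
        print(n, round(hol_saturation_throughput(n, 20_000), 3))
```

Under these assumptions the estimate is close to 0.75 for N = 2 and falls toward about 0.586 as N grows. With a speedup factor of c, the fabric runs c such cycles per time slot, so its capacity per slot is roughly c times the per-cycle figure; for c = 2 this exceeds one cell per port per slot, which is consistent with the 100% throughput reported above for non-bursty traffic.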

3.2.1.3 Parallel Switch

The parallel switch consists of K identical switch planes [21]. Each switch plane has its own input buffer and shares output buffers with the other planes. The parallel switch with K = 2 achieves the maximum throughput of 1.0. This is because the maximum throughput of each switch plane is more than 0.586 for arbitrary switch size N, so two planes that each carry half of the offered load can together sustain a full load. Since each input port distributes cells to different switch planes, cells may arrive out of order at the output port. This type of parallel switch therefore requires timestamps and cell sequence regeneration at the output buffers. In addition, the hardware resources needed to implement the switch are K times as much as for a single switch plane.
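The cell sequence regeneration step can be sketched as a small resequencing buffer. This is an illustration under stated assumptions, not the design from [21]: it assumes each cell carries a consecutive integer stamp (0, 1, 2, ...) assigned at its input port, and that the output keeps one such buffer per input. The class and method names are hypothetical.

```python
import heapq


class Resequencer:
    """Hypothetical per-input resequencing buffer at an output port."""

    def __init__(self):
        self.next_stamp = 0  # stamp of the next cell that may be released
        self.pending = []    # min-heap of (stamp, payload) held out of order

    def push(self, stamp, payload):
        """Accept a cell from any plane and return the cells now in order."""
        heapq.heappush(self.pending, (stamp, payload))
        released = []
        # Release cells only while the smallest held stamp is the expected one.
        while self.pending and self.pending[0][0] == self.next_stamp:
            released.append(heapq.heappop(self.pending)[1])
            self.next_stamp += 1
        return released


if __name__ == "__main__":
    buf = Resequencer()
    print(buf.push(1, "cell-1"))  # arrives early via the faster plane
    print(buf.push(0, "cell-0"))  # fills the gap, both cells are released
```

A cell that arrives ahead of its predecessor is held until the gap is filled, so the extra output buffering grows with the delay difference between planes.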

3.2.2 Increasing Scheduling Efficiency