Dual Round-Robin Matching (DRRM)

The iSLIP scheme operates as follows:

1. Each unmatched input sends a request to every output for which it has a queued cell.
2. If an unmatched output receives multiple requests, it chooses the one that appears next in a fixed round-robin schedule starting from the highest-priority element. The output notifies each input whether or not its request was granted.
3. If an input receives multiple grants, it accepts the one that appears next in a fixed round-robin schedule starting from the highest-priority element. The accept pointer $a_i$ is incremented (modulo N) to one location beyond the accepted output. The accept pointers $a_i$ are updated only in the first iteration.
4. The grant pointer $g_i$ is incremented (modulo N) to one location beyond the granted input if and only if the grant is accepted in step 3 of the first iteration. Like the accept pointers, the pointers $g_i$ are updated only in the first iteration.

Because the pointers move in round-robin fashion, we expect the algorithm to provide a fair allocation of bandwidth among all flows. This scheme contains 2N arbiters, each of which is implementable with low complexity. The throughput offered by this algorithm is 100% for any number of iterations, due to the desynchronization effect (see Section 3.3.4).

A matching example of this scheme is shown in Figure 3.9. Considering the example from the iRRM discussion, initially all pointers $a_i$ and $g_i$ are set to 1. In step 2 of iSLIP, each output grants the request that is closest to the pointed input in the clockwise direction; however, unlike in iRRM, the pointers $g_i$ are not updated in this step. They wait for the acceptance result. In step 3, each input accepts the grant that is closest to the one pointed to by $a_i$. The accept pointers change to one position beyond the accepted one: $a_1 = 2$, $a_2 = 1$, $a_3 = 1$, and $a_4 = 1$. Then, after the accept pointers decide which grant is accepted, the grant pointers change to one position beyond the accepted grant (i.e., a nonaccepted grant produces no change in a grant pointer position). The new values for these pointers are $g_1 = 2$, $g_2 = 1$, $g_3 = 4$, and $g_4 = 1$. In the following iterations, only the unmatched inputs and outputs are considered and the pointers are not modified (i.e., updating occurs in the first iteration only).
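The request, grant, and accept steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the reference implementation of iSLIP: ports are numbered from 0 rather than 1, and the names `rr_pick`, `islip`, `grant_ptr`, and `accept_ptr` are hypothetical. The request pattern in the usage lines is invented for illustration and is not the one in Figure 3.9.

```python
def rr_pick(candidates, pointer, n):
    """Return the first candidate at or after `pointer` in cyclic order, or None."""
    for offset in range(n):
        k = (pointer + offset) % n
        if k in candidates:
            return k
    return None


def islip(requests, grant_ptr, accept_ptr, iterations=1):
    """Run iSLIP for one cell slot (sketch; 0-based ports are an assumption).

    requests[i] is the set of outputs for which input i has a queued cell.
    grant_ptr[j] and accept_ptr[i] are updated in place, first iteration only.
    Returns a dict mapping each matched input to its output.
    """
    n = len(requests)
    in_match = {}   # input  -> output
    out_match = {}  # output -> input
    for it in range(iterations):
        # Step 1: unmatched inputs request every unmatched output they have a cell for.
        reqs_at_output = {j: set() for j in range(n) if j not in out_match}
        for i in range(n):
            if i in in_match:
                continue
            for j in requests[i]:
                if j in reqs_at_output:
                    reqs_at_output[j].add(i)
        # Step 2: each unmatched output grants the request next in round-robin order.
        grants_at_input = {}
        for j, inputs in reqs_at_output.items():
            i = rr_pick(inputs, grant_ptr[j], n)
            if i is not None:
                grants_at_input.setdefault(i, set()).add(j)
        # Steps 3 and 4: each input accepts one grant; both pointers move only
        # when a grant is accepted, and only in the first iteration.
        for i, outputs in grants_at_input.items():
            j = rr_pick(outputs, accept_ptr[i], n)
            in_match[i] = j
            out_match[j] = i
            if it == 0:
                accept_ptr[i] = (j + 1) % n
                grant_ptr[j] = (i + 1) % n
    return in_match


# Hypothetical 3x3 request pattern, all pointers initially at port 0.
grant_ptr = [0, 0, 0]
accept_ptr = [0, 0, 0]
match = islip([{0, 1}, {0, 1}, {2}], grant_ptr, accept_ptr, iterations=2)
print(match, grant_ptr, accept_ptr)
```

In this run, output 1 grants input 0 in the first iteration but the grant is not accepted, so its grant pointer stays where it was; input 1 is matched to output 1 only in the second iteration, which leaves all pointers untouched.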

3.3.4 Dual Round-Robin Matching (DRRM)

The DRRM scheme [3, 4] works similarly to iSLIP, also using round-robin selection instead of random selection, but it starts the round-robin selection at the inputs. An input arbiter selects a nonempty VOQ according to the round-robin service discipline. After the selection, each input sends a request, if it has one, to the arbiter of the destined output. An output arbiter receives up to N requests. It chooses one of them based on the round-robin service discipline, and sends a grant to the winning input port. Because of the two sets of independent round-robin arbiters, this arbitration scheme is called dual round-robin (DRR) arbitration.

The DRR arbitration has four steps in a cycle: (1) each input arbiter performs request selection, and (2) sends a request to the output arbiters; (3) each output arbiter performs grant arbitration, and (4) the output arbiters send grant signals to input arbiters.

Figure 3.10 shows an example of the DRR arbitration algorithm. In the request phase, each input chooses a VOQ and sends a request to an output arbiter. Assume input 1 has cells destined for both outputs 1 and 2. Since its round-robin pointer, $r_1$, is pointing to 1, input arbiter 1 sends a request to output 1 and updates its pointer to 2. Now consider output 3 in the grant phase. Since its round-robin pointer, $g_3$, is pointing to 3, output arbiter 3 grants access to input 3 and updates its pointer to 4.

Fig. 3.10 An example of the dual round-robin scheduling algorithm. (©2000 IEEE.)

Like iSLIP, DRRM has the desynchronization effect. The input arbiters granted in different time slots have different pointer values, and each of them requests a different output, resulting in desynchronization. However, the DRR scheme requires less time to do arbitration and is easier to implement, because less information needs to be exchanged between input arbiters and output arbiters. In other words, DRRM saves the initial transmission time that iSLIP requires to send requests from inputs to outputs.

Consider the fully loaded situation in which every VOQ always has cells. Figure 3.11 shows the HOL cells chosen from each input port in different time slots. In time slot 1, each input port chooses a cell destined for output A. Among those three cells, only one (the first one in this example) is granted and the other two have to wait at HOL. The round-robin (RR) pointer of the first input advances to point to output B in time slot 2, and a cell destined for B is chosen and then granted because there are no contenders. The other two inputs have their HOL cells unchanged, both destined for output A. Only one of them (the one from the second input) is granted, and the other has to wait until the third time slot. At that time, the round-robin pointers of the three inputs have been desynchronized and point to C, B, and A, respectively. As a result, all three cells chosen are granted.

Fig. 3.11 The desynchronization effect of DRRM under the fully loaded situation. Only HOL cells at each input are shown for illustration. (©2000 IEEE.)

Figure 3.12 shows the tail probability of input delay under the FIFO + RR (FIFO for input selection and RR for round-robin arbitration), DRR, and iSLIP arbitration schemes. The switch size is 256, and the average burst length is 10 cell slots (with the on-off model). The performance of DRR and iSLIP is comparable at a speedup of 2, while all three schemes have almost the same performance at speedups $c \geq 3$.

Fig. 3.12 Comparison of tail probability of input delay under three arbitration schemes.
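The fully loaded walk-through can be reproduced with a short Python simulation. This is a sketch under assumptions rather than the authors' implementation: ports are numbered from 0 (outputs A, B, C become 0, 1, 2), all pointers start at 0, and an input's request pointer advances only when its request is granted, which is what the unchanged HOL cells in the walk-through imply. The names `rr_pick`, `drrm_slot`, and `simulate_fully_loaded` are hypothetical.

```python
def rr_pick(candidates, pointer, n):
    """Return the first candidate at or after `pointer` in cyclic order, or None."""
    for offset in range(n):
        k = (pointer + offset) % n
        if k in candidates:
            return k
    return None


def drrm_slot(voq_nonempty, req_ptr, grant_ptr):
    """One DRR arbitration cycle (sketch); pointers are updated in place.

    voq_nonempty[i] is the set of outputs whose VOQ at input i holds a cell.
    Returns a dict mapping each granted input to its output.
    """
    n = len(voq_nonempty)
    # Request phase: each input arbiter picks one nonempty VOQ and requests that output.
    requests = {}  # output -> set of requesting inputs
    for i in range(n):
        j = rr_pick(voq_nonempty[i], req_ptr[i], n)
        if j is not None:
            requests.setdefault(j, set()).add(i)
    # Grant phase: each output arbiter grants one of the requests it received.
    match = {}
    for j, inputs in requests.items():
        i = rr_pick(inputs, grant_ptr[j], n)
        match[i] = j
        grant_ptr[j] = (i + 1) % n  # one position beyond the granted input
        req_ptr[i] = (j + 1) % n    # assumption: advance only when granted
    return match


def simulate_fully_loaded(n=3, slots=3):
    """Every VOQ always has cells; record (request pointers, match) per slot."""
    voq = [set(range(n)) for _ in range(n)]
    req_ptr = [0] * n
    grant_ptr = [0] * n
    history = []
    for _ in range(slots):
        pointers_before = list(req_ptr)
        history.append((pointers_before, drrm_slot(voq, req_ptr, grant_ptr)))
    return history


for slot, (pointers, match) in enumerate(simulate_fully_loaded(), start=1):
    print(f"slot {slot}: request pointers {pointers}, granted {match}")
```

With these assumptions, one input is granted in slot 1 and two in slot 2; by slot 3 the request pointers are [2, 1, 0] (C, B, A) and all three inputs are granted, matching the description above.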

3.3.5 Round-Robin Greedy Scheduling