ALU ARITHMETIC LOGIC UNIT
HUMBOLDT-UNIVERSITÄT ZU BERLIN
INSTITUT FÜR INFORMATIK
COMPUTER ARCHITECTURE
Lecture 10
ALU
Summer semester 2002
Course lead: Prof. Dr. Miroslaw Malek
www.informatik.hu-berlin.de/rok/ca
CA - X - AU - 1
ALU
ARITHMETIC / LOGIC UNIT
• Arithmetic Units Classification
• Number Representations
• Hardware/Software Continuum and Vertical Migration
• Integer Arithmetic
  – addition/subtraction
  – multiplication/division
• Decimal Arithmetic Unit
• Floating-Point Arithmetic
  – addition/subtraction
  – multiplication/division
• Logic Functions
TYPES OF ARITHMETIC UNITS
• SERIAL
  – Operations are performed bit by bit. The carry-out bit is fed back in the
    next cycle. Results are routed to a shift register to assemble a word.
  [Figure: serial adder. Operands A = a_{n-1} ... a_0 and B = b_{n-1} ... b_0 enter a one-bit adder (+) with carry feedback; the result Z = z_{n-1} ... z_0 is assembled bit by bit.]
• PARALLEL
  – Operands are presented to the unit in parallel. The circuits that carry out
    the operation may be:
    • Sequenced (ripple-carry technique)
    • Operating concurrently, e.g., carry-lookahead technique
  [Figure: parallel ALU. 32-bit operands A = a_{n-1} ... a_0 and B = b_{n-1} ... b_0 enter the ALU; the 32-bit result is Z = z_{n-1} ... z_0.]
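The serial scheme described above can be sketched in a few lines of Python. This is only an illustrative model, not a circuit from the lecture: the function name `serial_add` and the convention of passing bits least significant first are assumptions made for the example.

```python
def serial_add(a_bits, b_bits):
    """Add two equal-length bit lists (least significant bit first), one bit per cycle.

    Hypothetical helper for illustration; the list z_bits plays the role of
    the result shift register.
    """
    carry = 0
    z_bits = []
    for a, b in zip(a_bits, b_bits):
        total = a + b + carry
        z_bits.append(total % 2)   # sum bit shifted into the result register
        carry = total // 2         # carry-out fed back for the next cycle
    return z_bits, carry


# 5 + 3 with 4-bit operands, LSB first: 0101 + 0011 = 1000
z, c = serial_add([1, 0, 1, 0], [1, 1, 0, 0])
print(z, c)
```

One loop iteration corresponds to one clock cycle, which is why the serial adder needs n cycles for an n-bit word.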
ARITHMETIC UNITS CLASSIFICATION
BY LEVEL OF DESIGN COMPLEXITY
1. Fixed-Point Arithmetic
   – a. addition/subtraction of positive numbers
   – b. addition/subtraction of positive and negative numbers
   – c. multiplication
   – d. division
2. Decimal Arithmetic (BCD)
   – similar to Fixed-Point arithmetic
3. Floating-Point Arithmetic
   – a. multiplication
   – b. division
   – c. addition and subtraction
TRADEOFF BETWEEN
HARDWARE/SOFTWARE IMPLEMENTATION
• ALUs usually provide at least addition and subtraction, then:
  – Multiplication (fixed-point)
  – Division (fixed-point)
  – Floating Point
  – Special Functions/Tables
HARDWARE/SOFTWARE CONTINUUM AND VERTICAL MIGRATION
[Figure: operations listed in order of increasing complexity: ADD, SUBTRACT, SHIFT, STORE, HALT; MULTIPLY, DIVIDE; floating-point arithmetic operations; square root; polynomial evaluation; table search; matrix operations; function evaluation. The Hardware/Software boundary is marked at several positions along this list.]
NUMBER REPRESENTATION (1)
Bit pattern     Values represented
b3 b2 b1 b0     Sign and magnitude    1's complement    2's complement
----------------------------------------------------------------------
0111            +7                    +7                +7
0110            +6                    +6                +6
0101            +5                    +5                +5
0100            +4                    +4                +4
0011            +3                    +3                +3
0010            +2                    +2                +2
0001            +1                    +1                +1
0000            +0                    +0                +0
1000            -0                    -7                -8
1001            -1                    -6                -7
1010            -2                    -5                -6
1011            -3                    -4                -5
1100            -4                    -3                -4
1101            -5                    -2                -3
1110            -6                    -1                -2
1111            -7                    -0                -1
----------------------------------------------------------------------
[Figure: (a) Circle representation of integers mod N, with the points 0, 1, 2, ..., N-2, N-1. (b) Mod 16 system for 2's-complement numbers: the patterns 0000 to 1111 arranged on a circle and labelled 0, +1, ..., +7, -8, -7, ..., -1.]
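As a cross-check of the table above, the following minimal sketch decodes a bit pattern under all three conventions. The function name `decode` is hypothetical; note that Python integers have no negative zero, so the patterns shown as -0 in the table decode to 0 here.

```python
def decode(bits):
    """Return (sign-and-magnitude, 1's complement, 2's complement) values of a bit string.

    Illustrative helper, not part of the lecture material.
    """
    n = len(bits)
    raw = int(bits, 2)                          # value as an unsigned number
    sign = raw >> (n - 1)                       # most significant bit
    magnitude = raw & ((1 << (n - 1)) - 1)      # remaining n-1 bits
    sign_mag = -magnitude if sign else magnitude
    ones = raw - ((1 << n) - 1) if sign else raw
    twos = raw - (1 << n) if sign else raw
    return sign_mag, ones, twos


for pattern in ("0111", "1000", "1010", "1111"):
    print(pattern, decode(pattern))
```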
NUMBER REPRESENTATION (2) BCD
1. Binary-Coded Decimal (BCD) represents each decimal digit 0 through 9 in 4
   binary bits. Arithmetic is performed modulo 10. Since 4 bits give $2^4 = 16$
   codes, a digit sum greater than 9 is adjusted by adding 6 (= 0110), because 16 - 10 = 6.
   – 1010 is used for “+” and 1011 for “-”
• Example: 4739 + 1281 = 6020 in BCD code:

      0100 0111 0011 1001      (4739)
    + 0001 0010 1000 0001      (1281)
    ---------------------
    *(0101 1001 1011 1010)

  * The number needs to be adjusted by adding 0110, digit by digit from the right,
    with each carry passed on to the next digit:

    Digit        Raw sum   With carry in   After + 0110   Carry out   Result
    units        1010      1010            1 0000         1           0000 = 0
    tens         1011      1100            1 0010         1           0010 = 2
    hundreds     1001      1010            1 0000         1           0000 = 0
    thousands    0101      0110            (no adjust)    0           0110 = 6

  Result: 0110 0000 0010 0000 = 6020
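The add-6 correction used in the BCD example can be written out as a short sketch. The function name `bcd_add` and the representation of operands as lists of decimal digits (most significant first) are assumptions made for this illustration.

```python
def bcd_add(a_digits, b_digits):
    """Add two equal-length lists of decimal digits (most significant first).

    Each digit position is added in 4-bit binary and corrected by +6 when the
    sum exceeds 9. Illustrative sketch only.
    """
    result = []
    carry = 0
    for a, b in zip(reversed(a_digits), reversed(b_digits)):
        s = a + b + carry          # binary sum of two BCD digits plus carry-in
        if s > 9:                  # not a valid BCD digit
            s += 6                 # correction: 16 - 10 = 6
        carry = s >> 4             # bit 4 is the decimal carry
        result.append(s & 0xF)     # low 4 bits are the corrected digit
    return carry, result[::-1]


print(bcd_add([4, 7, 3, 9], [1, 2, 8, 1]))
```

Adding 6 pushes any sum above 9 past the 4-bit boundary, so the carry out of the 4-bit position coincides with the decimal carry.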
NUMBER REPRESENTATION (3)
BINARY REPRESENTATION
1. Position and magnitude
   – $B = b_{n-1} \ldots b_1 b_0$
   – $V(B) = b_{n-1} 2^{n-1} + \ldots + b_1 2^1 + b_0 2^0$
2. Signed numbers ($n$ = number of bits, $N$ = the actual number)
   $b_{n-1} = 0$: positive
   $b_{n-1} = 1$: negative
   – Sign & Magnitude: sign bit $S = b_{n-1}$, followed by the magnitude down to $b_0$
   – 1's Complement $\bar{N}$:
     $\bar{N} = (2^n - 1) - N$   (negation)
   – 2's Complement $N^*$:
     $N^* = 2^n - N = (2^n - 1) - N + 1 = \bar{N} + 1$   (negation plus one)
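The two complement definitions above translate directly into integer arithmetic. The sketch below uses hypothetical function names and reduces the 2's-complement result modulo 2^n so that the complement of 0 stays within n bits.

```python
def ones_complement(value, n_bits):
    """(2^n - 1) - N, which equals inverting every bit of an n-bit word."""
    return ((1 << n_bits) - 1) - value


def twos_complement(value, n_bits):
    """2^n - N, computed as 1's complement plus one, kept to n bits."""
    return (ones_complement(value, n_bits) + 1) % (1 << n_bits)


# 4-bit example: N = 5 = 0101
print(format(ones_complement(5, 4), "04b"))   # bitwise negation of 0101
print(format(twos_complement(5, 4), "04b"))   # negation plus one
```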
FRACTIONAL 2's COMPLEMENT REPRESENTATION (I)
$N^* = 2^{n+1} - N$
• FRACTIONS FORM
  X.XXX . . . X
  bit positions (powers of 2): 0 . -1 -2 . . . -m
• FOR FRACTION
  $n = 0$, so $N^* = 2^1 - N$
• EXAMPLE
  Let N = 0.0100101

        2 = 10.0000000
    - (N) =  0.0100101
    ------------------
       N* =  1.1011011
FRACTIONAL 2's COMPLEMENT REPRESENTATION (II)
positive              negative
decim.   binary       decim.   binary
0        0.000        -.875    1.001
.125     0.001        -.750    1.010
.250     0.010        -.625    1.011
.375     0.011        -.500    1.100
.500     0.100        -.375    1.101
.625     0.101        -.250    1.110
.750     0.110        -.125    1.111
.875     0.111

N* = 2 - N:
          2 = 10.000
   - (.375) = - .011
   -----------------
      -.375 =  1.101

Addition examples:
       .375 = 0.011
   + (.250) = 0.010
   ----------------
       .625 = 0.101

       .625 = 0.101
  + (-.125) = 1.111
  -----------------
       .500 =*0.100

       .375 =  0.011 = 0.011
  + (-.250) = -0.010 = 1.110
  --------------------------
       .125 =  0.001 = 0.001

* carry ignored
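The fractional examples above can be reproduced with a small sketch that stores a 1.m fraction as an integer scaled by 2^m. The helper names `frac_to_bits` and `frac_add`, and the default of m = 3 fraction bits, are assumptions chosen to match the table.

```python
def _format(word, m):
    """Render an (m+1)-bit word as a sign bit, a point, and m fraction bits."""
    bits = format(word, f"0{m + 1}b")
    return bits[0] + "." + bits[1:]


def frac_to_bits(value, m=3):
    """Encode a fraction in [-1, 1) in 2's complement (N* = 2 - N for negatives)."""
    scaled = round(value * (1 << m))
    return _format(scaled % (1 << (m + 1)), m)


def frac_add(x, y, m=3):
    """Add two encoded fractions; the carry out of the sign position is ignored."""
    a = int(x.replace(".", ""), 2)
    b = int(y.replace(".", ""), 2)
    return _format((a + b) % (1 << (m + 1)), m)


print(frac_to_bits(-0.375))        # 2 - 0.375 in binary
print(frac_add("0.101", "1.111"))  # .625 + (-.125)
```

Taking the sum modulo 2^(m+1) is the software counterpart of discarding the carry out of the sign position.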
ARITHMETIC OPERATIONS
1. Addition of positive numbers
2. Addition/subtraction of positive and negative numbers
3. Multiplication
a. positive numbers
b. signed numbers
4. Division (Integer)
5. Floating point
EXECUTION TIME
Execution time = ∑ Logic Gate Delay
- Assume any stage of an n-bit serial adder requires 2 ns
- A 32-bit add then takes 32 x 2 ns = 64 ns
- Memory access may be 5 ns (basic cycle)
- We want to improve the add speed to fall below the basic cycle time:
  - Faster logic
  - Accelerating the carry
The carry causes the delay, so the basic problem is to calculate the carry
more rapidly by looking ahead for it (e.g., carry-lookahead logic increases
the speed of operation).
N-bit adder/subtractor with
2's-complement ADD/SUBTRACT control
2's-complement's big advantage: the same circuit is used for add and sub
ADD → $S = X + Y$
SUB → $S = X + \bar{Y} + 1$
[Figure: n-bit adder with inputs x_{n-1} ... x_1 x_0 and y_{n-1} ... y_1 y_0, an ADD/SUB control line (ADD = 0, SUB = 1), carry-out c_n, and sum outputs s_{n-1} ... s_1 s_0.]
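A minimal behavioural sketch of the adder/subtractor follows. It assumes the common arrangement in which the control line inverts Y (for example through XOR gates) and also serves as the carry-in; the slide text names only the control line, so that detail and the function name `add_sub` are assumptions.

```python
def add_sub(x, y, sub, n=4):
    """n-bit adder with ADD/SUB control: sub=0 gives X + Y, sub=1 gives X + NOT(Y) + 1.

    Returns (sum bits as an integer, carry-out c_n). Illustrative model only.
    """
    mask = (1 << n) - 1
    y_in = (y ^ (mask if sub else 0)) & mask   # Y is inverted when sub = 1 (assumed XOR stage)
    total = (x & mask) + y_in + sub            # control line doubles as carry-in c_0
    return total & mask, total >> n


print(add_sub(5, 3, 0))   # 5 + 3
print(add_sub(5, 3, 1))   # 5 - 3
print(add_sub(3, 5, 1))   # 3 - 5, result is the 4-bit 2's-complement pattern of -2
```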
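LOGIC FOR ADDING TWO BITS
$s_i = \bar{x}_i \bar{y}_i c_i + \bar{x}_i y_i \bar{c}_i + x_i \bar{y}_i \bar{c}_i + x_i y_i c_i$
$c_{i+1} = x_i c_i + y_i c_i + x_i y_i$
Equivalently, in integer arithmetic:
$s_i = (x_i + y_i + c_i) \bmod 2$
$c_{i+1} = \lfloor (x_i + y_i + c_i) / 2 \rfloor$
[Figure: gate-level circuit producing s_i and c_{i+1} from x_i, y_i, c_i, and the block symbol ADDER (A) with inputs x_i, y_i, c_i and outputs s_i, c_{i+1}.]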
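The two forms of the one-bit adder, the sum-of-products logic and the integer arithmetic definition, can be checked against each other over all eight input combinations. The function name `full_adder` is chosen for this sketch.

```python
from itertools import product


def full_adder(x, y, c):
    """One-bit adder written as two-level AND-OR logic on 0/1 values."""
    nx, ny, nc = 1 - x, 1 - y, 1 - c           # complemented inputs
    s = (nx & ny & c) | (nx & y & nc) | (x & ny & nc) | (x & y & c)
    c_out = (x & c) | (y & c) | (x & y)
    return s, c_out


# Exhaustive comparison with s = (x+y+c) mod 2 and c_out = floor((x+y+c)/2)
for x, y, c in product((0, 1), repeat=3):
    assert full_adder(x, y, c) == ((x + y + c) % 2, (x + y + c) // 2)
```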
FAST ADDER DESIGN
The two-level logic expressions are:
(1) $s_i = \bar{x}_i \bar{y}_i c_i + \bar{x}_i y_i \bar{c}_i + x_i \bar{y}_i \bar{c}_i + x_i y_i c_i$
    $c_{i+1} = x_i c_i + y_i c_i + x_i y_i$
Factoring the second of these (the carry-out equation) into
(2) $c_{i+1} = x_i y_i + (x_i + y_i) c_i$
and defining a generate function
(3) $G_i = x_i y_i$
and a propagate function
(4) $P_i = x_i + y_i$
we can write
(5) $c_{i+1} = G_i + P_i c_i$
Replacing $i$ by $i-1$ in (5) gives $c_{(i-1)+1} = G_{i-1} + P_{i-1} c_{i-1}$, that is
(6) $c_i = G_{i-1} + P_{i-1} c_{i-1}$
Substituting (6) into (5):
    $c_{i+1} = G_i + P_i (G_{i-1} + P_{i-1} c_{i-1})$
(7) $c_{i+1} = G_i + P_i G_{i-1} + P_i P_{i-1} c_{i-1}$
Repeating the substitution down to $c_0$:
(8) $c_{i+1} = G_i + P_i G_{i-1} + P_i P_{i-1} G_{i-2} + \ldots + P_i \cdots P_1 G_0 + P_i \cdots P_0 c_0$
Pure Carry Lookahead circuit for computing
the carry out $c_n$ of an n-bit adder
[Figure: two-level AND-OR circuit with inputs G_{i-1}, P_{i-1}, G_{i-2}, P_{i-2}, G_{i-3}, ..., P_1, G_0, P_0, c_0 and output c_i.]
$c_i = G_{i-1} + P_{i-1} G_{i-2} + \ldots + P_{i-1} \cdots P_1 G_0 + P_{i-1} \cdots P_0 c_0$
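The expanded carry expression can be evaluated directly from the generate and propagate signals, without waiting for lower-order carries. The sketch below is a software model under the assumption that bit lists are given least significant bit first; the function name `lookahead_carries` is hypothetical.

```python
def lookahead_carries(x_bits, y_bits, c0=0):
    """Return [c_0, c_1, ..., c_n] using only G_i, P_i and c_0 (bits LSB first)."""
    g = [x & y for x, y in zip(x_bits, y_bits)]   # G_i = x_i y_i
    p = [x | y for x, y in zip(x_bits, y_bits)]   # P_i = x_i + y_i
    carries = [c0]
    for i in range(1, len(x_bits) + 1):
        # c_i = G_{i-1} + P_{i-1} G_{i-2} + ... + P_{i-1}...P_1 G_0 + P_{i-1}...P_0 c_0
        c_i = 0
        prefix = 1                     # running product P_{i-1} ... P_{j+1}
        for j in range(i - 1, -1, -1):
            c_i |= prefix & g[j]       # one AND term of the sum
            prefix &= p[j]
        c_i |= prefix & c0             # final term with the carry-in
        carries.append(c_i)
    return carries


# 0111 + 0001 (LSB first): the carry ripples through three positions
print(lookahead_carries([1, 1, 1, 0], [1, 0, 0, 0]))
```

In hardware every c_i is a separate two-level AND-OR network evaluated at the same time; the nested loops here only enumerate the terms of each network.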
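BLOCK LOOKAHEAD
Block propagate and generate functions $\tilde{P}_K$, $\tilde{G}_K$, shown for block $K = 0$:
$\tilde{P}_0 = P_3 P_2 P_1 P_0$
$\tilde{G}_0 = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0$
K=0 1st Block
K=1 2nd Block
etc.
Carry for a 16-bit adder:
$c_{16} = \tilde{G}_3 + \tilde{P}_3 \tilde{G}_2 + \tilde{P}_3 \tilde{P}_2 \tilde{G}_1 + \tilde{P}_3 \tilde{P}_2 \tilde{P}_1 \tilde{G}_0 + \tilde{P}_3 \tilde{P}_2 \tilde{P}_1 \tilde{P}_0 c_0$
Example: a 4-bit adder
$c_4 = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 c_0$
[Figure: 4-bit integrated ALU block with inputs x_3 y_3, x_2 y_2, x_1 y_1, x_0 y_0, control inputs and c_0; outputs s_3, s_2, s_1, s_0, c_4 and the block P and G signals.]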
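The block lookahead equations can be exercised with a short sketch that forms the four block generate/propagate pairs of a 16-bit adder and combines them into c16. The function names `block_gp` and `c16` are chosen for the example; within each block, index 0 is the least significant bit.

```python
def block_gp(g, p):
    """Block generate and propagate for one 4-bit block (index 0 = LSB of the block)."""
    block_p = p[3] & p[2] & p[1] & p[0]
    block_g = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    return block_g, block_p


def c16(x, y, c0=0):
    """Carry out of a 16-bit addition from the four block (G, P) pairs."""
    g = [((x >> i) & 1) & ((y >> i) & 1) for i in range(16)]   # bit-level G_i
    p = [((x >> i) & 1) | ((y >> i) & 1) for i in range(16)]   # bit-level P_i
    (G0, P0), (G1, P1), (G2, P2), (G3, P3) = (
        block_gp(g[4 * k:4 * k + 4], p[4 * k:4 * k + 4]) for k in range(4)
    )
    return (G3 | (P3 & G2) | (P3 & P2 & G1) | (P3 & P2 & P1 & G0)
            | (P3 & P2 & P1 & P0 & c0))


print(c16(0xFFFF, 0x0001))   # carry propagates through all four blocks
print(c16(0x7FFF, 0x0001))   # carry is absorbed in the top block
```

The second-level expression has the same shape as the 4-bit carry equation, which is what lets the same lookahead circuit be reused one level up.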
TIMING FOR AN ADDITION OPERATION BASED ON
CARRY LOOKAHEAD
Two expressions must be evaluated:
a) Carry Lookahead
   $c_i = G_{i-1} + P_{i-1} G_{i-2} + \ldots + P_{i-1} \cdots P_1 G_0 + P_{i-1} \cdots P_0 c_0$
b) The Sum
   $s_i = \bar{x}_i \bar{y}_i c_i + \bar{x}_i y_i \bar{c}_i + x_i \bar{y}_i \bar{c}_i + x_i y_i c_i$
Re a) A carry can be generated in three logic gate delays.
   1  Compute $P_i$, $G_i$
   1  AND the P's and G's
   1  OR the resulting AND terms
   -
   3
Re b) The sum can be completed in three additional logic gate delays.
   1  Form $c_i$
   1  AND $x_i, y_i, c_i, \bar{x}_i, \bar{y}_i, \bar{c}_i$
   1  OR the AND products
   -
   3
LIMITATION
a) Carry Lookahead with 4 blocks (32 bits, k=8)
   Gate fan-in is limited to 8 (usual circuit constraint)
   delays
   1    Generate $G_i$, generate $P_i$
   2    Form C8
   2    Form C16
   2    Form C24
   2    Form C31
   3    Form S31
   12 delays at 5 ns per gate → 60 ns for an add.
b) Carry Lookahead fully integrated (32 bits, k=32)
   without circuit constraints
   delays
   2    Generate $G_k$, generate $P_k$
   2    Form C31
   3    Form S31
   7 delays at 5 ns per gate → 35 ns for an add
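CARRY SKIP ADDER
[Figure: 16-bit carry-skip adder with inputs a_0 b_0 ... a_3 b_3 ... a_15 b_15, carries c_0, c_4, c_8, c_12, and block skip signals P_{4,7}, P_{8,11}, P_{12,15}.]
Notation: $P_{i,j} = P_i \cdot P_{i+1} \cdots P_j$
CARRY SELECT ADDER
[Figure: 8-bit carry-select adder. The lower block adds a_3 b_3 ... a_0 b_0 with carry-in c_0 and produces s_3 ... s_0 and c_4; the upper block a_7 b_7 ... a_4 b_4 is computed twice, once with carry-in 0 and once with carry-in 1, and c_4 selects s_7 ... s_4.]
- two additions are performed in parallel, one assuming carry 0,
  the other assuming carry 1
- when the carry is finally known, the correct sum is selected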
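The carry-select idea can be modelled for the 8-bit case in the figure. In this sketch the 4-bit blocks are stood in for by plain integer addition (the hypothetical helper `block_add`), since only the selection step is of interest; operands are assumed to fit in 8 bits.

```python
def block_add(a, b, carry_in, width=4):
    """Stand-in for a width-bit adder block: returns (sum bits, carry-out)."""
    total = a + b + carry_in
    return total & ((1 << width) - 1), total >> width


def carry_select_add8(a, b, c0=0):
    """8-bit carry-select addition of two 8-bit operands; returns (sum, carry-out)."""
    s_lo, c4 = block_add(a & 0xF, b & 0xF, c0)
    # Upper half is computed for both possible carries (in parallel in hardware)
    s_hi0, c8_0 = block_add(a >> 4, b >> 4, 0)
    s_hi1, c8_1 = block_add(a >> 4, b >> 4, 1)
    # Once c4 is known, a multiplexer picks the matching result
    s_hi, c8 = (s_hi1, c8_1) if c4 else (s_hi0, c8_0)
    return (s_hi << 4) | s_lo, c8


print(carry_select_add8(0x3F, 0x01))   # carry out of the low block selects the "carry 1" sum
```

The upper block costs twice the hardware, but its delay overlaps with the lower block, so only the multiplexer is added to the critical path.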
SUMMARY - ADDITION TECHNIQUES
Serial addition:    simple logic ↔ slow execution
Parallel addition:  complex logic ↔ faster execution
- Ripple carry
- Carry lookahead
- Carry skip
- Carry select

Technique                Time         Space
Ripple                   O(n)         O(n)
CLA (Carry Lookahead)    O(log n)     O(n log n)
Carry skip               O(√n)        O(n)
Carry select             O(√n)        O(n)