Chip Organization Trends in Computer Architecture

CHAPTER 7 MEMORY

may run for days before an error occurs. For this reason, early personal computers (PCs) did not use error detection circuitry, since PCs would be turned off at the end of the day, and so undetected errors would not accumulate. This helped to keep the prices of PCs competitive. With the drastic reduction in DRAM prices and the increased uptimes of PCs operating as automated teller machines (ATMs) and network file servers (NFSs), error detection circuitry is now commonplace in PCs. In the next section we explore how RAM cells are organized into chips.

7.3 Chip Organization

A simplified pinout of a RAM chip is shown in Figure 7-3. An $m$-bit address, having lines numbered from 0 to $m-1$, is applied to pins $A_0$ through $A_{m-1}$ while asserting $\overline{CS}$ (Chip Select) and driving $\overline{WR}$ low to write data to the chip or high to read data from the chip. The overbars on $\overline{CS}$ and $\overline{WR}$ indicate that the chip is selected when $\overline{CS} = 0$ and that a write operation will occur when $\overline{WR} = 0$. When reading data from the chip, after a time period $t_{AA}$ (the time delay from when the address lines are made valid to the time the data is available at the output), the $w$-bit data word appears on the data lines $D_0$ through $D_{w-1}$. When writing data to a chip, the data lines must also be held valid for a time period $t_{AA}$. Notice that the data lines are bidirectional in Figure 7-3, which is normally the case.

The address lines $A_0$ through $A_{m-1}$ in the RAM chip shown in Figure 7-3 contain an address, which is decoded from an $m$-bit address into one of $2^m$ locations within the chip, each of which has a $w$-bit word associated with it. The chip thus contains $2^m \times w$ bits.

Figure 7-3 Simplified RAM chip pinout.

Now consider the problem of creating a RAM that stores four four-bit words. A RAM can be thought of as a collection of registers. We can use four-bit registers to store the words, and then introduce an addressing mechanism that allows one of the words to be selected for reading or for writing. Figure 7-4 shows a design for the memory. Two address lines, $A_0$ and $A_1$, select a word for reading or writing via the 2-to-4 decoder. The outputs of the registers can be safely tied together without risking an electrical short because the 2-to-4 decoder ensures that at most one register is enabled at a time, and the disabled registers are electrically disconnected through the use of tri-state buffers. The Chip Select line in the decoder is not necessary, but will be used later in constructing larger RAMs.
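The four-word memory described above can be sketched in software. The following is a minimal behavioural model, not a gate-level description: the names `decode_2_to_4`, `Ram4x4`, and `cycle` are hypothetical, the control inputs are treated as active low (0 means asserted), and `None` is used as a stand-in for the high-impedance state of the tri-state buffers.

```python
from typing import List, Optional


def decode_2_to_4(a1: int, a0: int, cs_n: int) -> List[int]:
    """2-to-4 decoder with an active-low chip select.

    Returns four enable lines. At most one is 1, and all are 0 when
    the chip is deselected (cs_n == 1).
    """
    if cs_n != 0:
        return [0, 0, 0, 0]
    selected = (a1 << 1) | a0
    return [1 if i == selected else 0 for i in range(4)]


class Ram4x4:
    """Four 4-bit registers sharing one set of data lines."""

    def __init__(self) -> None:
        self.words = [0, 0, 0, 0]

    def cycle(self, a1: int, a0: int, cs_n: int, wr_n: int,
              data_in: int = 0) -> Optional[int]:
        """Perform one access and return the word read, or None.

        None models the data lines not being driven by the chip
        (chip deselected, or a write in progress).
        """
        enables = decode_2_to_4(a1, a0, cs_n)
        if 1 not in enables:
            return None
        index = enables.index(1)
        if wr_n == 0:
            # Write: keep only the four data bits D3..D0.
            self.words[index] = data_in & 0b1111
            return None
        return self.words[index]


ram = Ram4x4()
ram.cycle(a1=1, a0=0, cs_n=0, wr_n=0, data_in=0b1010)  # write word 2
value = ram.cycle(a1=1, a0=0, cs_n=0, wr_n=1)           # read word 2
floating = ram.cycle(a1=1, a0=0, cs_n=1, wr_n=1)        # chip deselected
```

Because the decoder enables at most one register, only one word ever drives the shared outputs, which is the property that lets the register outputs be tied together.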
Figure 7-4 A four-word memory with four bits per word in a 2D organization.

A simplified drawing of the RAM is shown in Figure 7-5.

Figure 7-5 A simplified version of the four-word by four-bit RAM.

There are two common ways to organize the generalized RAM shown in Figure 7-3. In the smallest RAM chips it is practical to use a single decoder to select one out of $2^m$ words, each of which is $w$ bits wide. However, this organization is not economical in ordinary RAM chips. Consider that a 64M × 1 chip has 26 address lines (64M = $2^{26}$). This means that a conventional decoder would need $2^{26}$ 26-input AND gates, which manifests itself as a large cost in terms of chip area, and this is just for the decoder.

Since most ICs are roughly square, an alternate decoding structure that significantly reduces the decoder complexity decodes the rows separately from the columns. This is referred to as a 2-1/2D organization. The 2-1/2D organization is by far the most prevalent organization for RAM ICs. Figure 7-6 shows a $2^6$-word × 1-bit RAM with a 2-1/2D organization. The six address lines are evenly split between a row decoder and a column decoder (the column decoder is actually a MUX/DEMUX combination). A single bidirectional data line is used for input and output.

Figure 7-6 2-1/2D organization of a 64-word by one-bit RAM. (Figure annotation: two bits wide, one bit for data and one bit for select.)

During a read operation, an entire row is selected and fed into the column MUX, which selects a single bit for output.
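To put rough numbers on the decoder cost discussed above, the following sketch compares the gate count of a single conventional decoder with that of separate row and column decoding. It rests on simplifying assumptions that are not from the text: the column MUX/DEMUX is counted as needing one selection gate per column, just like a decoder, and the high-order address bits are taken to be the row address. The function names are hypothetical.

```python
from typing import Tuple


def decoder_and_gates(address_bits: int) -> int:
    """AND gates in a conventional one-level decoder: one per output."""
    return 2 ** address_bits


def split_decoder_and_gates(address_bits: int) -> int:
    """Selection gates when rows and columns are decoded separately.

    Assumes the address is split as evenly as possible and that the
    column MUX/DEMUX costs one selection gate per column.
    """
    row_bits = address_bits // 2
    col_bits = address_bits - row_bits
    return 2 ** row_bits + 2 ** col_bits


def split_address(address: int, address_bits: int) -> Tuple[int, int]:
    """Split an address into (row, column).

    Assumption: the high-order bits select the row and the low-order
    bits select the column.
    """
    col_bits = address_bits - address_bits // 2
    return address >> col_bits, address & ((1 << col_bits) - 1)


single = decoder_and_gates(26)        # 64M x 1 chip, one large decoder
split = split_decoder_and_gates(26)   # 13 row bits + 13 column bits
row, col = split_address(0b101011, 6) # 64-word x 1-bit RAM of Figure 7-6
```

Under these assumptions the 64M × 1 chip drops from $2^{26}$ selection gates to $2 \times 2^{13}$, and each gate needs only 13 inputs instead of 26.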
During a write operation, the single bit to be written is distributed by the DEMUX to the target column, while the row decoder selects the proper row to be written.

In practice, to reduce pin count, there are generally only $m/2$ address pins on the chip, and the row and column addresses are time-multiplexed on these $m/2$ address lines. First, the $m/2$-bit row address is applied along with a row address strobe (RAS) signal. The row address is latched and decoded by the chip. Then the $m/2$-bit column address is applied, along with a column address strobe (CAS). There may be additional pins to control the chip refresh and other memory functions.

Even with this 2-1/2D organization and splitting the address into row and column components, there is still a great fan-in/fan-out demand on the decoder logic gates, and the still large number of address pins forces memory chips into large footprints on printed circuit boards (PCBs). In order to reduce the fan-in/fan-out constraints, tree decoders may be used, which are discussed in Section 7.8.1. A newer memory architecture that serializes the address lines onto a single input pin is discussed in Section 7.9.

Although DRAMs are very economical, SRAMs offer greater speed. The refresh cycles, error detection circuitry, and low operating powers of DRAMs limit them to roughly 1/4 of SRAM speed, but SRAMs also incur a significant cost.

The performance of both types of memory (SRAM and DRAM) can be improved. Normally a number of words constituting a block will be accessed in succession. In this situation, memory accesses can be interleaved so that while one memory is accessing address $A_m$, other memories are accessing $A_{m+1}$, $A_{m+2}$, $A_{m+3}$, etc. In this way the access time for each word can appear to be many times faster.

7.3.1 CONSTRUCTING LARGE RAMS FROM SMALL RAMS

We can construct larger RAM modules from smaller RAM modules. Both the word size and the number of words per module can be increased.
For example, eight 16M × 1-bit RAM modules can be combined to make a 16M × 8-bit RAM module, and 32 16M × 1-bit RAM modules can be combined to make a 64M × 8-bit RAM module.

As a simple example, consider using the 4-word × 4-bit RAM chip shown in Figure 7-5 as a building block to first make a 4-word × 8-bit module, and then an 8-word × 4-bit module. We would like to increase the width of the four-bit words and also increase the number of words. Consider first the problem of increasing the word width from four bits to eight. We can accomplish this by simply using two chips, tying their CS (chip select) lines together so they are both selected together, and juxtaposing their data lines, as shown in Figure 7-7.

Consider now the problem of increasing the number of words from four to eight. Figure 7-8 shows a configuration that accomplishes this. The eight words are distributed over the two four-word RAMs. Address line $A_2$ is needed because there are now eight words to be addressed. A decoder for $A_2$ enables either the upper or lower memory module by using the CS lines, and then the remaining address lines $A_0$ and $A_1$ are decoded within the enabled module. A combination of these two approaches can be used to scale both the word size and number of words to arbitrary sizes.
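The two expansion techniques above can be illustrated with a small behavioural sketch. All class and function names here are hypothetical, `None` stands in for undriven (high-impedance) data lines, and two details are assumptions rather than facts from the text: which chip holds the upper four bits of the wide word, and which chip is enabled when $A_2 = 0$. The helper `chips_needed` also assumes the target dimensions are exact multiples of the chip dimensions.

```python
from typing import Optional


class RamChip:
    """Behavioural model of a small RAM with active-low CS and WR."""

    def __init__(self, words: int = 4, width: int = 4) -> None:
        self.width = width
        self.cells = [0] * words

    def access(self, address: int, cs_n: int, wr_n: int,
               data_in: int = 0) -> Optional[int]:
        if cs_n != 0:
            return None  # deselected: the chip does not drive its outputs
        if wr_n == 0:
            self.cells[address] = data_in & ((1 << self.width) - 1)
            return None
        return self.cells[address]


class WideModule:
    """4-word x 8-bit module: two chips share the address and CS lines."""

    def __init__(self) -> None:
        self.high, self.low = RamChip(), RamChip()

    def access(self, address: int, cs_n: int, wr_n: int,
               data_in: int = 0) -> Optional[int]:
        hi = self.high.access(address, cs_n, wr_n, data_in >> 4)
        lo = self.low.access(address, cs_n, wr_n, data_in & 0xF)
        if hi is None or lo is None:
            return None
        return (hi << 4) | lo  # data lines placed side by side


class DeepModule:
    """8-word x 4-bit module: A2 decides which chip is enabled."""

    def __init__(self) -> None:
        self.chips = [RamChip(), RamChip()]

    def access(self, address: int, cs_n: int, wr_n: int,
               data_in: int = 0) -> Optional[int]:
        a2, low_bits = (address >> 2) & 1, address & 0b11
        result = None
        for number, chip in enumerate(self.chips):
            # One-bit decoder on A2, gated by the module's own CS.
            chip_cs_n = 0 if (cs_n == 0 and a2 == number) else 1
            out = chip.access(low_bits, chip_cs_n, wr_n, data_in)
            if out is not None:
                result = out
        return result


def chips_needed(target_words: int, target_width: int,
                 chip_words: int, chip_width: int) -> int:
    """Chip count when both dimensions are exact multiples."""
    return (target_words // chip_words) * (target_width // chip_width)


wide = WideModule()
wide.access(3, cs_n=0, wr_n=0, data_in=0xA5)
wide_value = wide.access(3, cs_n=0, wr_n=1)

deep = DeepModule()
deep.access(6, cs_n=0, wr_n=0, data_in=0b1001)
deep_value = deep.access(6, cs_n=0, wr_n=1)
```

The same arithmetic reproduces the counts quoted in the text: `chips_needed(2**24, 8, 2**24, 1)` gives 8 and `chips_needed(2**26, 8, 2**24, 1)` gives 32.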

7.4 Commercial Memory Modules