From CISC to RISC

CHAPTER 10 TRENDS IN COMPUTER ARCHITECTURE

speedup as a direct percent can be represented as:

$$S = \frac{T_{wo} - T_{w}}{T_{w}} \times 100$$

We can develop a more fine-grained equation for estimating T if we have information about the machine’s clock period, τ, the number of clock cycles per instruction, CPI, and a count of the number of instructions executed by the program during its execution, IC. In this case the total execution time for the program is given by:

$$T = \mathit{IC} \times \mathit{CPI} \times \tau$$

CPI and IC can be expressed either as an average over the instruction set and total count, respectively, or summed over each kind and number of instructions in the instruction set and program. Substituting the latter equation into the former we get:

$$S = \frac{\mathit{IC}_{wo} \times \mathit{CPI}_{wo} \times \tau_{wo} - \mathit{IC}_{w} \times \mathit{CPI}_{w} \times \tau_{w}}{\mathit{IC}_{w} \times \mathit{CPI}_{w} \times \tau_{w}} \times 100$$

These equations, and others derived from them, are useful in computing and estimating the impact of changes in instructions and architecture upon performance.

EXAMPLE: CALCULATING SPEEDUP FOR A NEW INSTRUCTION SET

Suppose we wish to estimate the speedup obtained by replacing a CPU having an average CPI of 5 with another CPU having an average CPI of 3.5, with the clock period increased from 100 ns to 120 ns. Assuming the instruction count is the same on both CPUs, IC cancels and the equation above becomes:

$$S = \frac{5 \times 100 - 3.5 \times 120}{3.5 \times 120} \times 100 \approx 19\%$$

Thus, without actually running a benchmark program we can estimate the impact of an architectural change upon performance. ■
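The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the function names are invented here, the subscripts wo and w are taken to mean the two CPUs being compared, and the instruction count of one million is an arbitrary assumption that cancels out of the result.

```python
def execution_time(ic, cpi, tau_ns):
    """Total execution time T = IC * CPI * tau, in nanoseconds."""
    return ic * cpi * tau_ns


def speedup_percent(t_wo, t_w):
    """Speedup S = (T_wo - T_w) / T_w * 100, as a direct percent."""
    return (t_wo - t_w) / t_w * 100


# Assumption: both CPUs execute the same number of instructions,
# so the particular value chosen for IC does not affect S.
ic = 1_000_000

t_wo = execution_time(ic, cpi=5, tau_ns=100)    # CPU with CPI of 5, 100 ns clock
t_w = execution_time(ic, cpi=3.5, tau_ns=120)   # CPU with CPI of 3.5, 120 ns clock

s = speedup_percent(t_wo, t_w)
print(f"Estimated speedup: {s:.1f}%")
```

Changing `ic` leaves the result untouched, which is why the worked equation can drop the IC terms when the instruction count is the same for both machines.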

10.2 From CISC to RISC

Historically, when memory cycle times were very long and when memory prices were high, a smaller number of complicated instructions held an advantage over a larger number of simpler instructions. There came a point, however, when memory became inexpensive enough, and memory hierarchies became fast and large enough, that computer architects began reexamining this advantage. One technology that affected this examination was pipelining, that is, keeping the execution unit more or less the same, but allowing different instructions, each of which requires several clock cycles to execute, to use different parts of the execution unit on each clock cycle. For example, one instruction might be accessing operands in the register file while another is using the ALU.

We will cover pipelining in more detail later in the chapter, but the important point to make here is that computer architects learned that CISC instructions do not fit pipelined architectures very well. For pipelining to work effectively, each instruction needs to have similarities to other instructions, at least in terms of relative instruction complexity. The reason can be viewed in analogy to an assembly line that produces different models of an automobile. For efficiency, each “station” of the assembly line should do approximately the same amount and kind of work. If the amount or kind of work done at each station is radically different for different models, then periodically the assembly line will have to “stall” to accommodate the requirements of the given model.
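The assembly-line argument can be illustrated with a toy model. The sketch below assumes a four-stage pipeline with no buffering between stages, so an instruction cannot leave a stage until the next stage has been vacated. The stage count, the per-stage cycle counts, and the function name are all invented for illustration and do not describe any particular machine.

```python
def pipeline_cycles(program):
    """Return the total cycles needed to run `program` through a linear pipeline.

    `program` is a list of instructions; each instruction is a list giving
    the number of cycles it needs in each stage. An instruction is blocked
    in its current stage until the next stage has been vacated.
    """
    n_stages = len(program[0])
    prev_depart = [0] * n_stages  # when the previous instruction left each stage
    for durations in program:
        depart = [0] * n_stages
        enter = prev_depart[0]  # stage 0 is free once the previous instruction left it
        for s in range(n_stages):
            done = enter + durations[s]
            if s + 1 < n_stages:
                # Stall here until the next stage is empty.
                done = max(done, prev_depart[s + 1])
            depart[s] = done
            enter = done
        prev_depart = depart
    return prev_depart[-1]


# Eight instructions that all need one cycle per stage.
uniform = [[1, 1, 1, 1] for _ in range(8)]

# The same program, except one instruction needs four cycles in the third stage.
mixed = [list(instr) for instr in uniform]
mixed[3] = [1, 1, 4, 1]

print("uniform:", pipeline_cycles(uniform), "cycles")
print("mixed:  ", pipeline_cycles(mixed), "cycles")
```

Under these assumed numbers the uniform program finishes in 4 + 8 - 1 = 11 cycles, while the single complex instruction in the mixed program adds three stall cycles that every later instruction also has to wait through, for 14 cycles in total.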
CISC instruction sets have the disadvantage that some instructions, such as register-to-register moves, are inherently simple, whereas others, such as the MVC instruction and others like it, are complex and take many more clock cycles to execute.

The main philosophical underpinnings of the RISC approach are:

• Prefetch instructions into an instruction queue in the CPU before they are needed. This has the effect of hiding the latency associated with the instruction fetch.
• With instruction fetch times no longer a penalty, and with cheap memory to hold a greater number of instructions, there is no real advantage to CISC instructions. All instructions should be composed of sequences of RISC instructions, even though the number of instructions needed may increase typically by as much as 1/3 over a CISC approach.
• Moving operands between registers and memory is expensive, and should be minimized.
• The RISC instruction set should be designed with pipelined architectures in mind.
• There is no requirement that CISC instructions be maintained as integrated wholes; they can be decomposed into sequences of simpler RISC instructions.

The result is that RISC architectures have characteristics that distinguish them from CISC architectures:

• All instructions are of fixed length, one machine word in size.
• All instructions perform simple operations that can be issued into the pipeline at a rate of one per clock cycle. Complex operations are now composed of simple instructions by the compiler.
• All operands must be in registers before being operated upon. There is a separate class of memory access instructions: LOAD and STORE. This is referred to as a LOAD-STORE architecture.
• Addressing modes are limited to simple ones. Complex addressing calculations are built up using sequences of simple operations.
• There should be a large number of general registers for arithmetic operations so that temporary variables can be stored in registers rather than on a stack in memory.

In the next few sections, we explore additional motivations for RISC architectures, and special characteristics that make RISC architectures effective.
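As a concrete picture of the LOAD-STORE characteristic listed above, the following sketch decomposes a memory-to-memory add into a sequence of simple instructions and runs it on a tiny interpreter. Everything here is hypothetical: the three-opcode instruction set, the tuple encoding, the register numbers, and the memory addresses are invented for illustration and do not correspond to any real architecture.

```python
def run(program, memory, n_regs=8):
    """Execute a list of (opcode, operands...) tuples on a toy register machine."""
    regs = [0] * n_regs
    for op, *args in program:
        if op == "LOAD":        # LOAD rd, addr: register <- memory
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "STORE":     # STORE rs, addr: memory <- register
            rs, addr = args
            memory[addr] = regs[rs]
        elif op == "ADD":       # ADD rd, rs1, rs2: registers only
            rd, rs1, rs2 = args
            regs[rd] = regs[rs1] + regs[rs2]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


def expand_mem_add(dst, src1, src2):
    """Decompose a hypothetical CISC-style 'add memory to memory' instruction
    into simple instructions: only LOAD and STORE touch memory."""
    return [
        ("LOAD", 1, src1),
        ("LOAD", 2, src2),
        ("ADD", 3, 1, 2),
        ("STORE", 3, dst),
    ]


memory = {0x10: 7, 0x14: 5, 0x18: 0}
sequence = expand_mem_add(dst=0x18, src1=0x10, src2=0x14)
run(sequence, memory)
print(len(sequence), "simple instructions, result:", memory[0x18])
```

One complex instruction becomes four simple ones here, which is the instruction-count cost the RISC approach accepts in exchange for uniform instructions that suit a pipeline.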

10.3 Pipelining the Datapath