Lecture 11: Exceptions and Pipelining Basics

Professor Mike Schulte
Computer Architecture ECE 201

Exceptions

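[Figure: a user program is interrupted by an exception, control transfers to the system exception handler, and the handler returns from the exception to the user program.]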

normal control flow: sequential, jumps, branches, calls, returns

° Exception = unprogrammed control transfer

system takes action to handle the exception

must record the address of the offending instruction
record any other information necessary to return afterwards

returns control to user

must save & restore user state

Two Types of Exceptions

° Interrupts

caused by external events
asynchronous to program execution
may be handled between instructions
simply suspend and resume user program

° Traps (or Exceptions)

caused by internal events

exceptional conditions (overflow)
invalid instruction
faults (non-resident page in memory)
internal hardware error

synchronous to program execution
condition must be remedied by the handler
instruction may be retried or simulated and program continued, or program may be aborted

MIPS convention

° Exception means any unexpected change in control flow, without distinguishing internal or external.

° Use the term interrupt only when the event is externally caused.

Type of event                     From where?   MIPS terminology
I/O device request                External      Interrupt
Invoke OS from user program       Internal      Exception
Arithmetic overflow               Internal      Exception
Using an undefined instruction    Internal      Exception
Hardware malfunctions             Either        Exception or Interrupt

Additions to MIPS ISA to support Exceptions?

° EPC: a 32-bit register used to hold the address of the affected instruction.

° Cause: a register used to record the cause of the exception. To simplify the discussion, assume

undefined instruction = 0
arithmetic overflow = 1

° Status: interrupt mask and enable bits; determines what exceptions can occur.

° Control signals to write EPC, Cause, and Status

° Be able to write exception address into PC: increase the PC mux so that PC can be set to the exception address (C000 0000 hex).

° May have to undo PC = PC + 4, since we want EPC to point to the offending instruction (not its successor): PC = PC - 4

Big Picture: user / system modes

° By providing two modes of execution (user/system) it is possible for the computer to manage itself

operating system is a special program that runs in the privileged mode and has access to all of the resources of the computer
presents “virtual resources” to each user that are more convenient than the physical resources

files vs. disk sectors
virtual memory vs. physical memory

protects each user program from others

° Exceptions allow the system to take action in response to events that occur while the user program is executing

O/S begins at the handler

How Control Detects Exceptions in our FSD

° Undefined instruction: detected when no next state is defined from state 1 for the op value.

• We handle this exception by defining the next state value for all op values other than lw, sw, 0 (R-type), jmp, beq, and ori as a new state

° Arithmetic overflow: the ALU includes logic to detect overflow, and a signal called Overflow is provided as an output from the ALU. This signal is used in the modified finite state machine to specify an additional possible next state

° Note: the challenge in designing control of a real machine is to handle different interactions between instructions and other exception-causing events such that control logic remains small and fast.

• Complex interactions make the control unit the most challenging aspect of hardware design

Modification to the Control Specification

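[State diagram, listed as states and register transfers]

Instruction fetch: IR <= MEM[PC]; PC <= PC + 4

Decode / register fetch: A <= R[rs]; B <= R[rt]

Next state from decode, by opcode:

BEQ: S <= A - B; on Equal, PC <= PC + SX || 00; on ~Equal, PC is left unchanged
R-type: S <= A fun B; then R[rd] <= S, or on overflow go to the overflow state
ORi: S <= A op ZX; then R[rt] <= S
LW: S <= A + SX; M <= MEM[S]; R[rt] <= M
SW: S <= A + SX; MEM[S] <= B
other (undefined instruction): EPC <= PC - 4; PC <= exp_addr; cause <= 0 (UI)

Overflow state (additional condition from the datapath to indicate overflow): EPC <= PC - 4; PC <= exp_addr; cause <= 1 (Ovf)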
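
As a rough illustration of the two exception states, the following Python sketch models what happens to PC, EPC, and Cause when an exception is taken after the fetch state has already incremented the PC. The function and constant names are hypothetical; only the cause codes (0 and 1) and the handler address C000 0000 hex come from the slides.

```python
# Cause codes and handler address as given in the slides; all names are illustrative.
CAUSE_UNDEFINED_INSTRUCTION = 0
CAUSE_ARITHMETIC_OVERFLOW = 1
EXCEPTION_ADDRESS = 0xC0000000


def add_overflows(a, b):
    """Return True if a + b does not fit in a signed 32-bit word."""
    total = a + b
    return not (-2**31 <= total <= 2**31 - 1)


def take_exception(pc_after_fetch, cause_code):
    """Return (new_pc, epc, cause) for an exception taken after PC <= PC + 4."""
    epc = pc_after_fetch - 4  # undo the increment so EPC names the offending instruction
    return EXCEPTION_ADDRESS, epc, cause_code


# An add at address 0x00400024 overflows; the PC already holds 0x00400028.
if add_overflows(2**31 - 1, 1):
    new_pc, epc, cause = take_exception(0x00400028, CAUSE_ARITHMETIC_OVERFLOW)
```

The handler would later use the saved EPC to return to (or skip past) the offending instruction; that return path is not modeled here.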

Pipelining is Natural!

° Pipelining provides a method for executing multiple instructions at the same time.

° Laundry Example

° Ann, Brian, Cathy, Dave (A, B, C, D) each have one load of clothes to wash, dry, and fold

° Washer takes 30 minutes

° Dryer takes 40 minutes

° “Folder” takes 20 minutes

Sequential Laundry

[Figure: timeline ending at midnight with loads A, B, C, D in task order; each load takes 30 + 40 + 20 minutes, one load after another.]

° Sequential laundry takes 6 hours for 4 loads

° If they learned pipelining, how long would laundry take?

Pipelined Laundry: Start work ASAP

[Figure: timeline starting at 6 PM with loads A, B, C, D in task order; the segments along the time axis are 30, 40, 40, 40, 40, 20 minutes.]

° Pipelined laundry takes 3.5 hours for 4 loads
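
The two laundry totals can be reproduced with a small scheduling sketch. This is only an illustration: it assumes each machine handles one load at a time and that a finished load can wait in a basket until the next machine is free. The function names are hypothetical.

```python
STAGE_MINUTES = [30, 40, 20]  # washer, dryer, "folder"


def sequential_minutes(stage_minutes, loads):
    """Each load runs start to finish before the next one begins."""
    return loads * sum(stage_minutes)


def pipelined_minutes(stage_minutes, loads):
    """Each load starts a stage as soon as both the load and the machine are ready."""
    free_at = [0] * len(stage_minutes)  # time at which each machine becomes free
    finish = 0
    for _ in range(loads):
        t = 0  # time at which this load is ready for its next stage
        for stage, minutes in enumerate(stage_minutes):
            start = max(t, free_at[stage])
            t = start + minutes
            free_at[stage] = t
        finish = t
    return finish


sequential = sequential_minutes(STAGE_MINUTES, 4)  # 360 minutes = 6 hours
pipelined = pipelined_minutes(STAGE_MINUTES, 4)    # 210 minutes = 3.5 hours
```

After the first wash, the 40-minute dryer is the machine every load waits for, which is why the pipelined total is 30 + 4 x 40 + 20 minutes.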

Pipelining Lessons

° Pipelining doesn’t help latency of single task, it helps throughput of entire workload

° Pipeline rate limited by slowest pipeline stage

° Multiple tasks operating simultaneously using different resources

° Potential speedup = Number pipe stages

° Unbalanced lengths of pipe stages reduces speedup

° Time to “fill” pipeline and time to “drain” it reduces speedup

° Stall for Dependences

[Figure: the pipelined laundry timeline for loads A, B, C, D, starting at 6 PM, with segments of 30, 40, 40, 40, 40, 20 minutes.]
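
A minimal sketch of these lessons for a clocked pipeline, where every stage advances once per cycle and the cycle time is set by the slowest stage. This is a simplified model (no stalls), slightly different from the start-work-ASAP laundry schedule, and the function name is hypothetical.

```python
def pipeline_speedup(stage_times, tasks):
    """Speedup of a clocked pipeline over running each task start to finish."""
    unpipelined = tasks * sum(stage_times)
    cycle = max(stage_times)               # rate limited by the slowest stage
    cycles = len(stage_times) + tasks - 1  # fill and drain add (stages - 1) cycles
    return unpipelined / (cycles * cycle)


# Few tasks: fill and drain time dominate, so the speedup is well below 5.
balanced_few = pipeline_speedup([10] * 5, 4)

# Many tasks, balanced stages: speedup approaches the number of stages (5).
balanced_many = pipeline_speedup([10] * 5, 1_000_000)

# Many tasks, unbalanced stages: speedup approaches 90 / 40 = 2.25, not 3.
unbalanced_many = pipeline_speedup([30, 40, 20], 1_000_000)
```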

The Five Stages of the Load Instruction

[Figure: a Load occupies Cycle 1 through Cycle 5 as Ifetch, Reg/Dec, Exec, Mem, Wr.]

° Ifetch: Instruction Fetch

Fetch the instruction from the Instruction Memory

° Reg/Dec: Registers Fetch and Instruction Decode

° Exec: Calculate the memory address

° Mem: Read the data from the Data Memory

° Wr: Write the data back to the register file

Pipelined Execution

° On a processor multiple instructions are in various stages at the same time.

° Assume each instruction takes five cycles

[Figure: program flow against time; successive instructions each pass through IFetch, Dcd, Exec, Mem, WB, overlapped in time.]

Single Cycle, Multiple Cycle, vs. Pipeline

[Figure: clock timing for a Load, Store, R-type sequence under three implementations. Single Cycle Implementation: Load in Cycle 1 and Store in Cycle 2, with part of the Store's cycle marked Waste. Multiple Cycle Implementation: across Cycle 1 to Cycle 10, Load takes Ifetch, Reg, Exec, Mem, Wr; Store takes Ifetch, Reg, Exec, Mem; R-type begins with Ifetch. Pipeline Implementation: Load, Store, and R-type each pass through Ifetch, Reg, Exec, Mem, Wr, overlapped.]

Graphically Representing Pipelines

° Can help with answering questions like:

How many cycles does it take to execute this code?
What is the ALU doing during cycle 4?
Are two instructions trying to use the same resource at the same time?

[Figure: instruction order against time (clock cycles); Inst 0 and Inst 1 each pass through Im, Reg, ALU, Dm, Reg.]
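
The same questions can be answered programmatically. The sketch below assumes an ideal five-stage pipeline with no stalls, where instruction i (counted from 0) enters the pipeline in cycle i + 1; the stage labels follow the figures and the function names are hypothetical.

```python
STAGES = ["Im", "Reg", "ALU", "Dm", "Reg"]  # resource used in each of the five stages


def total_cycles(num_instructions):
    """Cycles needed to run the code with no stalls."""
    return len(STAGES) + num_instructions - 1


def stage_in_cycle(instr_index, cycle):
    """Resource used by an instruction in a given cycle (1-based), or None."""
    position = cycle - 1 - instr_index
    return STAGES[position] if 0 <= position < len(STAGES) else None


def instruction_using(stage_position, cycle, num_instructions):
    """Index of the instruction occupying a stage position (0-based) in a cycle, or None."""
    index = cycle - 1 - stage_position
    return index if 0 <= index < num_instructions else None


def render(num_instructions):
    """Draw one row per instruction and one column per clock cycle."""
    rows = []
    for i in range(num_instructions):
        cells = [stage_in_cycle(i, c) or "." for c in range(1, total_cycles(num_instructions) + 1)]
        rows.append(f"Inst {i}: " + " ".join(f"{cell:>3}" for cell in cells))
    return "\n".join(rows)


cycles_for_five = total_cycles(5)             # how many cycles does this code take?
alu_user_cycle_4 = instruction_using(2, 4, 5)  # which instruction has the ALU in cycle 4?
print(render(5))
```

Because each stage position is occupied by at most one instruction per cycle in this model, two instructions only collide when two different stages share one physical resource.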

Why Pipeline? Because the resources are there!

[Figure: instruction order against time (clock cycles); Inst 0 through Inst 4 each pass through Im, Reg, ALU, Dm, Reg, overlapped in time.]

Why Pipeline?

° Suppose

100 instructions are executed
The single cycle machine has a cycle time of 45 ns
The multicycle and pipeline machines have cycle times of 10 ns
The multicycle machine has a CPI of 4.6

° Single Cycle Machine

45 ns/cycle x 1 CPI x 100 inst = 4500 ns

° Multicycle Machine

10 ns/cycle x 4.6 CPI x 100 inst = 4600 ns

° Ideal pipelined machine

10 ns/cycle x (1 CPI x 100 inst + 4 cycle drain) = 1040 ns

° Ideal pipelined vs. single cycle speedup

4500 ns / 1040 ns = 4.33

° What has not yet been considered?
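
The execution-time comparison can be checked with a few lines of Python. The variable names are illustrative; the cycle times, CPI values, instruction count, and 4-cycle drain are the ones stated in the slide.

```python
instructions = 100

single_cycle_ns = 45 * 1 * instructions      # 45 ns/cycle, CPI of 1
multicycle_ns = 10 * 4.6 * instructions      # 10 ns/cycle, CPI of 4.6
pipelined_ns = 10 * (1 * instructions + 4)   # 10 ns/cycle, CPI of 1, plus 4 cycles to drain

speedup = single_cycle_ns / pipelined_ns     # ideal pipelined vs. single cycle

print(single_cycle_ns, round(multicycle_ns), pipelined_ns, round(speedup, 2))
```

This is the ideal case: it assumes the pipeline never stalls.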

Can pipelining get us into trouble?

° Yes: Pipeline Hazards

structural hazards: attempt to use the same resource two different ways at the same time

E.g., two instructions try to read the same memory at the same time

data hazards: attempt to use item before it is ready

instruction depends on result of prior instruction still in the pipeline

add r1, r2, r3
sub r4, r2, r1

control hazards: attempt to make a decision before condition is evaluated

branch instructions

beq r1, loop
add r1, r2, r3

° Can always resolve hazards by waiting

pipeline control must detect the hazard
take action (or delay action) to resolve hazards

Single Memory is a Structural Hazard

[Figure: instruction order against time (clock cycles); Load, Instr 1, Instr 2, Instr 3, Instr 4 each pass through Mem, Reg, ALU, Mem, Reg, so with a single memory the Load's data access falls in the same cycle as a later instruction's fetch.]

Detection is easy in this case! (right half highlight means read, left half write)

Structural Hazards limit performance

° Example: if 1.3 memory accesses per instruction and only one memory access per cycle then average CPI ≥ 1.3

• otherwise resource is more than 100% utilized

° Solution 1: Use separate instruction and data memories

° Solution 2: Allow memory to read and write more than one word per cycle

° Solution 3: Stall
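
The CPI bound in the example follows from resource utilization: a memory that serves a limited number of accesses per cycle cannot deliver more than that. A minimal sketch, with a hypothetical function name and an assumed ideal CPI of 1 when memory is not the bottleneck:

```python
def min_cpi(accesses_per_instruction, accesses_per_cycle, ideal_cpi=1.0):
    """Lower bound on average CPI imposed by a shared memory port."""
    return max(ideal_cpi, accesses_per_instruction / accesses_per_cycle)


single_memory = min_cpi(1.3, 1)  # one access per cycle: CPI is at least 1.3
wider_memory = min_cpi(1.3, 2)   # two accesses per cycle: memory no longer limits CPI
```

With separate instruction and data memories, each memory sees at most one access per instruction (1.0 fetches and 0.3 data accesses), so neither pushes the bound above the ideal CPI.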

Control Hazard Solutions

° Stall: wait until decision is clear

• It's possible to move up decision to 2nd stage by adding hardware to check registers as being read

[Figure: instruction order against time (clock cycles); Add, Beq, Load each pass through Mem, Reg, ALU, Mem, Reg, with the Load delayed until the branch decision is clear.]

° Impact: 2 clock cycles per branch instruction => slow

Control Hazard Solutions

° Predict: guess one direction then back up if wrong

Predict not taken

° Impact: 1 clock cycle per branch instruction if right, 2 if wrong (right about 50% of time)

° More dynamic scheme: history of 1 branch (right about 90% of time)

[Figure: instruction order against time (clock cycles); Add, Beq, Load each pass through Mem, Reg, ALU, Mem, Reg, with the Load fetched in the cycle right after the Beq.]

Control Hazard Solutions

° Redefine branch behavior (takes place after next instruction): “delayed branch”

° Impact: 1 clock cycle per branch instruction if can find instruction to put in “slot” (about 50% of time)

° As more instructions are launched per clock cycle, the delayed branch becomes less useful

[Figure: instruction order against time (clock cycles); Add, Beq, Misc, Load each pass through Mem, Reg, ALU, Mem, Reg, with Misc in the slot after the Beq.]
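
The three control hazard solutions can be compared by their average cost per branch. The sketch below uses the per-branch costs and percentages quoted in the slides; the function name is hypothetical, and the 2-cycle cost for an unfilled delay slot is an assumption (the slides only give the 1-cycle case).

```python
def cycles_per_branch(cost_if_good, cost_if_bad, fraction_good):
    """Expected clock cycles per branch for a scheme that succeeds a given fraction of the time."""
    return fraction_good * cost_if_good + (1 - fraction_good) * cost_if_bad


stall = 2                                              # always wait for the decision
predict_not_taken = cycles_per_branch(1, 2, 0.5)       # right about 50% of the time
one_branch_history = cycles_per_branch(1, 2, 0.9)      # right about 90% of the time
delayed_branch = cycles_per_branch(1, 2, 0.5)          # slot filled about 50% of the time (2-cycle miss cost assumed)

for name, cost in [("stall", stall), ("predict not taken", predict_not_taken),
                   ("1-branch history", one_branch_history), ("delayed branch", delayed_branch)]:
    print(f"{name}: {cost:.2f} cycles per branch")
```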

Data Hazard on r1

add r1, r2, r3
sub r4, r1, r3
and r6, r1, r7
or r8, r1, r9
xor r10, r1, r11

Problem: r1 cannot be read by other instructions before it is written by the add
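
A minimal sketch of how such dependences could be found for this instruction sequence. It assumes a five-stage pipeline without forwarding, in which registers are read in stage 2 and written in stage 5; the flag `write_before_read` models a register file that writes in the first half of a cycle and reads in the second half. All names are hypothetical.

```python
program = [
    ("add", "r1", "r2", "r3"),
    ("sub", "r4", "r1", "r3"),
    ("and", "r6", "r1", "r7"),
    ("or", "r8", "r1", "r9"),
    ("xor", "r10", "r1", "r11"),
]


def raw_dependences(instrs):
    """List (producer_index, consumer_index, register) for each read of an earlier result."""
    deps = []
    last_writer = {}
    for i, (_, dest, *sources) in enumerate(instrs):
        for reg in sources:
            if reg in last_writer:
                deps.append((last_writer[reg], i, reg))
        last_writer[dest] = i
    return deps


def hazards(instrs, write_before_read):
    """Dependences whose read (stage 2) would happen before the write (stage 5) completes."""
    result = []
    for producer, consumer, reg in raw_dependences(instrs):
        distance = consumer - producer
        # Read cycle is consumer + 2, write cycle is producer + 5.
        if distance < 3 or (distance == 3 and not write_before_read):
            result.append((producer, consumer, reg))
    return result


with_split_cycle = hazards(program, write_before_read=True)      # sub and and
without_split_cycle = hazards(program, write_before_read=False)  # sub, and, and or
```

In both cases the xor is far enough behind the add that it reads r1 after it has been written.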

Data Hazard on r1:

Dependencies backwards in time are hazards

[Figure: instruction order against time (clock cycles), with stages labeled ID/RF, EX, MEM, WB; add r1,r2,r3, sub r4,r1,r3, and r6,r1,r7, or r8,r1,r9, and xor r10,r1,r11 each pass through Im, Reg, ALU, Dm, Reg.]

Data Hazard Solution:

“Forward” result from one stage to another
“or” OK if define read/write properly

[Figure: instruction order against time (clock cycles), with stages labeled ID/RF, EX, MEM, WB; the same five instructions (add, sub, and, or, xor) each pass through Im, Reg, ALU, Dm, Reg, with the add result forwarded to the instructions that read r1.]

Forwarding (or Bypassing): What about Loads?

Dependencies backwards in time are hazards
Can’t solve with forwarding:
Must delay/stall instruction dependent on loads

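lw r1, 0(r2)
sub r4, r1, r3

[Figure: the lw and the sub against time (clock cycles), with stages labeled ID/RF, EX, MEM, WB; each passes through Im, Reg, ALU, Dm, Reg, and the sub needs r1 before the lw has read it from data memory.]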
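
A minimal sketch of why forwarding is enough after an ALU instruction but not after a load. It assumes a five-stage pipeline with full forwarding, where an ALU result is ready at the end of stage 3 (EX), a loaded value at the end of stage 4 (MEM), and a dependent instruction needs its operand at the start of its own stage 3. The function name is hypothetical.

```python
def stall_cycles(producer_op, distance):
    """Bubbles needed before a consumer that is `distance` instructions after the producer."""
    ready_stage = 4 if producer_op == "lw" else 3  # stage at whose end the value exists
    needed_stage = 3                               # consumer needs the value entering EX
    # Value is usable from cycle p + ready_stage + 1; consumer's EX is cycle p + distance + needed_stage.
    return max(0, ready_stage + 1 - (distance + needed_stage))


after_add = stall_cycles("add", 1)     # forwarding covers it: no stall
after_load = stall_cycles("lw", 1)     # lw followed directly by a use: one stall cycle
load_one_apart = stall_cycles("lw", 2) # one independent instruction in between: no stall
```

This is why the instruction right after a load must be delayed by one cycle when it depends on the loaded value, or why an independent instruction is scheduled between them.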
