12 Processor Structure and Function [Compatibility Mode]

William Stallings
Computer Organization and Architecture
8th Edition

Chapter 12
Processor Structure and Function

CPU Structure

• CPU must:
  - Fetch instructions
  - Interpret instructions
  - Fetch data
  - Process data
  - Write data

[Figure: CPU and the systems bus]
[Figure: CPU structure]

Registers

• CPU must have some working space (temporary storage)
• Called registers
• Number and function vary between processor designs
• One of the major design decisions
• Top level of memory hierarchy

User Visible Registers

• General Purpose
• Data
• Address
• Condition Codes

General Purpose Registers (1)

• May be true general purpose
• May be restricted
• May be used for data or addressing
• Data
  - Accumulator
• Addressing
  - Segment

General Purpose Registers (2)

• Make them general purpose
  - Increase flexibility and programmer options
  - Increase instruction size & complexity
• Make them specialized
  - Smaller (faster) instructions
  - Less flexibility

How Many GP Registers?

• Between 8 and 32
• Fewer = more memory references
• More does not reduce memory references and takes up processor real estate
• See also RISC

How big?

• Large enough to hold full address
• Large enough to hold full word
• Often possible to combine two data registers
  - C programming
  - double int a;
  - long int a;

Condition Code Registers

• Sets of individual bits
  - e.g. result of last operation was zero
• Can be read (implicitly) by programs
  - e.g. Jump if zero
• Cannot (usually) be set by programs

Control & Status Registers

• Program Counter
• Instruction Decoding Register
• Memory Address Register
• Memory Buffer Register
• Revision: what do these all do?

Program Status Word

• A set of bits
• Includes Condition Codes
• Sign of last result
• Zero
• Carry
• Equal
• Overflow
• Interrupt enable/disable
• Supervisor
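The condition-code bits listed above can be illustrated with a small sketch. It assumes an 8-bit ALU and a hypothetical helper name (`add8_with_flags`); real processors differ in word size and in which flags they expose.

```python
def add8_with_flags(a: int, b: int) -> tuple[int, dict[str, bool]]:
    """Add two 8-bit values and derive typical condition-code bits.

    Illustrative only: the 8-bit width and the flag names are assumptions.
    """
    a &= 0xFF
    b &= 0xFF
    raw = a + b                 # may exceed 8 bits
    result = raw & 0xFF         # value actually stored in the register
    flags = {
        "sign": bool(result & 0x80),      # top bit of the result
        "zero": result == 0,              # result of last operation was zero
        "carry": raw > 0xFF,              # carry out of the top bit
        # Signed overflow: both operands share a sign, result's sign differs.
        "overflow": bool(~(a ^ b) & (a ^ result) & 0x80),
    }
    return result, flags


if __name__ == "__main__":
    print(add8_with_flags(0x7F, 0x01))
    print(add8_with_flags(0xFF, 0x01))
```

A conditional jump such as "jump if zero" reads these bits implicitly: it tests the zero flag left behind by the previous operation.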

Supervisor Mode

• Intel ring zero
• Kernel mode
• Allows privileged instructions to execute
• Used by operating system
• Not available to user programs

Other Registers

• May have registers pointing to:
  - Process control blocks (see O/S)
  - Interrupt Vectors (see O/S)
• N.B. CPU design and operating system design are closely linked

[Figure: Example Register Organizations]

Instruction Cycle

• Revision
• Stallings Chapter 3

Indirect Cycle

• May require memory access to fetch operands
• Indirect addressing requires more memory accesses
• Can be thought of as additional instruction subcycle

[Figure: Instruction Cycle with Indirect]
[Figure: Instruction Cycle State Diagram]

Data Flow (Instruction Fetch)

• Depends on CPU design
• In general:
• Fetch
  - PC contains address of next instruction
  - Address moved to MAR
  - Address placed on address bus
  - Control unit requests memory read
  - Result placed on data bus, copied to MBR, then to IR
  - Meanwhile PC incremented by 1

Data Flow (Data Fetch)

• IR is examined
• If indirect addressing, indirect cycle is performed
  - Rightmost N bits of MBR transferred to MAR
  - Control unit requests memory read
  - Result (address of operand) moved to MBR

[Figure: Data Flow (Fetch Diagram)]
[Figure: Data Flow (Indirect Diagram)]
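The fetch and indirect data flows can be traced with a toy register-transfer model. This is a sketch under stated assumptions: the `CPU` class, the word-addressed memory list, and the 16-bit instruction format (4-bit opcode, 12-bit address field) are all hypothetical and chosen only to make the register movements visible.

```python
from dataclasses import dataclass


@dataclass
class CPU:
    """Toy model of PC, MAR, MBR and IR (hypothetical, for illustration)."""
    memory: list[int]
    pc: int = 0
    mar: int = 0
    mbr: int = 0
    ir: int = 0

    def _memory_read(self) -> None:
        # Control unit requests a read; the word at MAR arrives in MBR.
        self.mbr = self.memory[self.mar]

    def fetch(self) -> None:
        self.mar = self.pc          # address moved to MAR
        self._memory_read()         # result copied to MBR
        self.pc += 1                # meanwhile PC incremented by 1
        self.ir = self.mbr          # MBR copied to IR

    def indirect(self, addr_bits: int) -> None:
        # Rightmost N bits of MBR (the address field) transferred to MAR.
        self.mar = self.mbr & ((1 << addr_bits) - 1)
        self._memory_read()         # MBR now holds the address of the operand


if __name__ == "__main__":
    mem = [0] * 16
    mem[0] = 0x1005   # assumed format: opcode 1, address field 5
    mem[5] = 9        # location 5 holds the address of the operand
    cpu = CPU(mem)
    cpu.fetch()
    cpu.indirect(addr_bits=12)
    print(hex(cpu.ir), cpu.pc, cpu.mar, cpu.mbr)
```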

Data Flow (Execute)

• May take many forms
• Depends on instruction being executed
• May include
  - Memory read/write
  - Input/Output
  - Register transfers
  - ALU operations

Data Flow (Interrupt)

• Simple
• Predictable
• Current PC saved to allow resumption after interrupt
• Contents of PC copied to MBR
• Special memory location (e.g. stack pointer) loaded to MAR
• MBR written to memory
• PC loaded with address of interrupt handling routine
• Next instruction (first of interrupt handler) can be fetched

[Figure: Data Flow (Interrupt Diagram)]
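The interrupt data flow can be written out as the same kind of register-transfer sketch. The `interrupt` function, the dictionary of registers, and the downward-growing stack are assumptions made for illustration; the slide only says that a special location such as the stack pointer supplies the save address.

```python
def interrupt(regs: dict[str, int], memory: list[int], handler_addr: int) -> None:
    """Save the current PC and redirect execution to the handler (toy model)."""
    regs["mbr"] = regs["pc"]            # contents of PC copied to MBR
    regs["sp"] -= 1                     # assumption: stack grows downward
    regs["mar"] = regs["sp"]            # special memory location loaded to MAR
    memory[regs["mar"]] = regs["mbr"]   # MBR written to memory
    regs["pc"] = handler_addr           # PC loaded with handler address


if __name__ == "__main__":
    regs = {"pc": 7, "sp": 16, "mar": 0, "mbr": 0}
    memory = [0] * 16
    interrupt(regs, memory, handler_addr=100)
    print(regs, memory[15])
```

After the call, the next fetch uses the new PC, so the first instruction of the handler is fetched; the saved PC in memory allows resumption afterwards.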

Prefetch

• Fetch accessing main memory
• Execution usually does not access main memory
• Can fetch next instruction during execution of current instruction
• Called instruction prefetch

Improved Performance

• But not doubled:
  - Fetch usually shorter than execution
    - Prefetch more than one instruction?
  - Any jump or branch means that prefetched instructions are not the required instructions
• Add more stages to improve performance

Pipelining

• Fetch instruction
• Decode instruction
• Calculate operands (i.e. EAs)
• Fetch operands
• Execute instructions
• Write result
• Overlap these operations

[Figure: Instruction Pipeline]
[Figure: Timing Diagram for Instruction Pipeline Operation]
[Figure: The Effect of a Conditional Branch on Instruction Pipeline Operation]
[Figure: Pipeline Depiction]
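A timing diagram for the six stages listed under Pipelining can be generated with a few lines of code. This sketch assumes an ideal pipeline: every stage takes one time unit and there are no stalls or branches. The stage abbreviations (FI, DI, CO, FO, EI, WO) and function names are labels chosen here for illustration. Under these assumptions n instructions on a k-stage pipeline need k + n - 1 time units, against n × k without overlap.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]  # assumed abbreviations


def timing_diagram(n_instructions: int, stages: list[str] = STAGES) -> list[list[str]]:
    """Return one row per instruction; column t holds the stage active at time t."""
    total = len(stages) + n_instructions - 1
    rows = []
    for i in range(n_instructions):
        row = ["  "] * total
        for s, name in enumerate(stages):
            row[i + s] = name       # instruction i enters stage s at time i + s
        rows.append(row)
    return rows


def total_cycles(n_instructions: int, k_stages: int) -> int:
    """Time units for an ideal pipeline with no stalls."""
    return k_stages + n_instructions - 1


def speedup(n_instructions: int, k_stages: int) -> float:
    """Ratio of unpipelined time (n * k) to ideal pipelined time."""
    return (n_instructions * k_stages) / total_cycles(n_instructions, k_stages)


if __name__ == "__main__":
    for i, row in enumerate(timing_diagram(9), start=1):
        print(f"I{i}: " + " ".join(row))
    print(total_cycles(9, 6), round(speedup(9, 6), 2))
```

The speedup approaches the number of stages only as the instruction count grows, and any stall or discarded prefetch lowers it further.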

Pipeline Hazards

• Pipeline, or some portion of pipeline, must stall
• Also called pipeline bubble
• Types of hazards
  - Resource
  - Data
  - Control

Resource Hazards

• Two (or more) instructions in pipeline need same resource
• Executed in serial rather than parallel for part of pipeline
• Also called structural hazard
• E.g. assume simplified five-stage pipeline
  - Each stage takes one clock cycle
• Ideal case is new instruction enters pipeline each clock cycle
• Assume main memory has single port
• Assume instruction fetches and data reads and writes performed one at a time
• Ignore the cache
• Operand read or write cannot be performed in parallel with instruction fetch
• Fetch instruction stage must idle for one cycle before fetching I3
• E.g. multiple instructions ready to enter execute instruction phase
• Single ALU
• One solution: increase available resources
  - Multiple main memory ports

Data Hazards

• Conflict in access of an operand location
• Two instructions to be executed in sequence
• Both access a particular memory or register operand
• If in strict sequence, no problem occurs
• If in a pipeline, operand value could be updated so as to produce different result from strict sequential execution
• E.g. x86 machine instruction sequence:

    ADD EAX, EBX    /* EAX = EAX + EBX */
    SUB ECX, EAX    /* ECX = ECX - EAX */

• ADD instruction does not update EAX until end of stage 5, at clock cycle 5
• SUB instruction needs value at beginning of its stage 2, at clock cycle 4
• Pipeline must stall for two clock cycles
• Without special hardware and specific avoidance algorithms, results in inefficient pipeline usage

[Figure: Data Hazard Diagram]

Types of Data Hazard

• Read after write (RAW), or true dependency
  - An instruction modifies a register or memory location
  - Succeeding instruction reads data in that location
  - Hazard if read takes place before write complete
• Write after read (WAR), or antidependency
  - An instruction reads a register or memory location
  - Succeeding instruction writes to location
  - Hazard if write completes before read takes place
• Write after write (WAW), or output dependency
  - Two instructions both write to same location
  - Hazard if writes take place in reverse order of intended sequence
• Previous example is RAW hazard

[Figure: Hazard Diagram]
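The three dependency types reduce to set intersections over what each instruction reads and writes. The sketch below is illustrative: the `Instr` record and `hazards` function are hypothetical names, and the read/write sets for the ADD/SUB pair are entered by hand (the flags register is ignored for simplicity). It reports potential hazards only; whether the pipeline actually stalls depends on stage timing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """An instruction with hand-written read and write sets (illustrative)."""
    text: str
    reads: frozenset[str]
    writes: frozenset[str]


def hazards(first: Instr, second: Instr) -> dict[str, set[str]]:
    """Classify dependencies of `second` on `first`, in program order."""
    return {
        "RAW": set(first.writes & second.reads),    # true dependency
        "WAR": set(first.reads & second.writes),    # antidependency
        "WAW": set(first.writes & second.writes),   # output dependency
    }


ADD = Instr("ADD EAX, EBX", frozenset({"EAX", "EBX"}), frozenset({"EAX"}))
SUB = Instr("SUB ECX, EAX", frozenset({"ECX", "EAX"}), frozenset({"ECX"}))

if __name__ == "__main__":
    print(hazards(ADD, SUB))
```

For the ADD/SUB pair only the RAW set is non-empty (it contains EAX), which matches the classification given on the slide.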

Control Hazard

• Also known as branch hazard
• Pipeline makes wrong decision on branch prediction
• Brings instructions into pipeline that must subsequently be discarded
• Dealing with Branches
  - Multiple Streams
  - Prefetch Branch Target
  - Loop buffer
  - Branch prediction
  - Delayed branching

Multiple Streams

• Have two pipelines
• Prefetch each branch into a separate pipeline
• Use appropriate pipeline
• Leads to bus & register contention
• Multiple branches lead to further pipelines being needed

Prefetch Branch Target

• Target of branch is prefetched in addition to instructions following branch
• Keep target until branch is executed
• Used by IBM 360/91

Loop Buffer

• Very fast memory
• Maintained by fetch stage of pipeline
• Check buffer before fetching from memory
• Very good for small loops or jumps
• c.f. cache
• Used by CRAY-1

[Figure: Loop Buffer Diagram]

Branch Prediction (1)

• Predict never taken
  - Assume that jump will not happen
  - Always fetch next instruction
  - 68020 & VAX 11/780
  - VAX will not prefetch after branch if a page fault would result (O/S v CPU design)
• Predict always taken
  - Assume that jump will happen
  - Always fetch target instruction

Branch Prediction (2)

• Predict by Opcode
  - Some instructions are more likely to result in a jump than others
  - Can get up to 75% success
• Taken/Not taken switch
  - Based on previous history
  - Good for loops
  - Refined by two-level or correlation-based branch history
• Correlation-based
  - In loop-closing branches, history is good predictor
  - In more complex structures, branch direction correlates with that of related branches

Branch Prediction (3)

• Delayed Branch
  - Do not take jump until you have to
  - Rearrange instructions

[Figure: Branch Prediction Flowchart]
[Figure: Branch Prediction State Diagram]
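A history-based taken/not-taken switch is often realized as a two-bit saturating counter. The sketch below uses that scheme as an assumption; it is one common variant and not necessarily the exact state machine in the slide's state diagram. The class and function names are hypothetical.

```python
class TwoBitPredictor:
    """Two-bit saturating counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self, state: int = 3) -> None:
        self.state = state

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def accuracy(outcomes: list[bool], predictor: TwoBitPredictor) -> float:
    """Fraction of branches predicted correctly over a recorded outcome trace."""
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct / len(outcomes)


if __name__ == "__main__":
    # A loop-closing branch: taken nine times, then falls through; run three times.
    trace = ([True] * 9 + [False]) * 3
    print(accuracy(trace, TwoBitPredictor()))
```

With two bits of history, the single not-taken outcome at loop exit does not flip the prediction, so the predictor misses only once per pass through the loop. This is why history works well for loop-closing branches.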

Intel 80486 Pipelining

• Fetch
  - From cache or external memory
  - Put in one of two 16-byte prefetch buffers
  - Fill buffer with new data as soon as old data consumed
  - Average 5 instructions fetched per load
  - Independent of other stages to keep buffers full
• Decode stage 1
  - Opcode & address-mode info
  - At most first 3 bytes of instruction
  - Can direct D2 stage to get rest of instruction
• Decode stage 2
  - Expand opcode into control signals
  - Computation of complex address modes
• Execute
  - ALU operations, cache access, register update
• Writeback

[Figure: Instruction Pipeline Examples]
[Figures: registers]

MMX Register Mapping

• MMX uses several 64 bit data types
• Use 3 bit register address fields
  - 8 registers
• No MMX specific registers
  - Aliasing to lower 64 bits of existing floating point registers

[Figure: Mapping of MMX Registers to Floating-Point Registers]

Pentium Interrupt Processing

• Interrupts
  - Maskable
  - Nonmaskable
• Exceptions
  - Processor detected
  - Programmed
• Interrupt vector table
  - Each interrupt type assigned a number
  - Index to vector table
  - 256 * 32 bit interrupt vectors
• 5 priority classes
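The indexing arithmetic for a table of 256 32-bit vectors is short enough to show directly. This is a minimal sketch: the function name is hypothetical, and offsets are relative to the start of the table, whose base address is not given here.

```python
VECTOR_COUNT = 256          # one entry per interrupt type number
VECTOR_BYTES = 32 // 8      # each vector is 32 bits wide
TABLE_BYTES = VECTOR_COUNT * VECTOR_BYTES


def vector_offset(interrupt_number: int) -> int:
    """Byte offset of a vector from the (unspecified) base of the table."""
    if not 0 <= interrupt_number < VECTOR_COUNT:
        raise ValueError("interrupt number must be in 0..255")
    return interrupt_number * VECTOR_BYTES


if __name__ == "__main__":
    print(TABLE_BYTES, vector_offset(0), vector_offset(255))
```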

ARM Attributes

• RISC
• Moderate array of uniform registers
  - More than most CISC, less than many RISC
• Load/store model
  - Operations perform on operands in registers only
• Uniform fixed-length instruction
  - 32 bits for the standard set, 16 bits for Thumb
• Shift or rotation can preprocess source registers
  - Separate ALU and shifter units
• Small number of addressing modes
  - All load/store addresses from registers and instruction fields
  - No indirect or indexed addressing involving values in memory
• Auto-increment and auto-decrement addressing
  - Improve loops

[Figure: ARM Organization]

ARM Processor Organization

• Many variations depending on ARM version
• Data exchanged between processor and memory through data bus
• Data item (load/store) or instruction (fetch)
• Instructions go through decoder before execution
• Pipeline and control signal generation in control unit
• Data goes to register file
  - Set of 32 bit registers
  - Byte & halfword twos complement data sign extended
• Typically two source and one result register
• Rotation or shift before ALU

ARM Processor Modes

• User
• Privileged
  - 6 modes
    - OS can tailor systems software use
    - Some registers dedicated to each privileged mode
    - Swifter context changes
• Exception
  - 5 of privileged modes
  - Entered on given exceptions
  - Substitute some registers for user registers
    - Avoid corruption

Privileged Modes

• System Mode
  - Not exception
  - Uses same registers as User mode
  - Can be interrupted by…
• Supervisor mode
  - OS
  - Software interrupt used to invoke operating system services
• Abort mode
  - Memory faults
• Undefined mode
  - Attempt instruction that is not supported by integer core or coprocessors
• Fast interrupt mode
  - Interrupt signal from designated fast interrupt source
  - Fast interrupt cannot be interrupted
  - May interrupt normal interrupt
• Interrupt mode

ARM Register Organization Table

Modes: User, plus six privileged modes (System, Supervisor, Abort, Undefined, Interrupt, Fast Interrupt). The last five privileged modes are the exception modes.

User      System    Supervisor  Abort     Undefined  Interrupt  Fast Interrupt
R0        R0        R0          R0        R0         R0         R0
R1        R1        R1          R1        R1         R1         R1
R2        R2        R2          R2        R2         R2         R2
R3        R3        R3          R3        R3         R3         R3
R4        R4        R4          R4        R4         R4         R4
R5        R5        R5          R5        R5         R5         R5
R6        R6        R6          R6        R6         R6         R6
R7        R7        R7          R7        R7         R7         R7
R8        R8        R8          R8        R8         R8         R8_fiq
R9        R9        R9          R9        R9         R9         R9_fiq
R10       R10       R10         R10       R10        R10        R10_fiq
R11       R11       R11         R11       R11        R11        R11_fiq
R12       R12       R12         R12       R12        R12        R12_fiq
R13 (SP)  R13 (SP)  R13_svc     R13_abt   R13_und    R13_irq    R13_fiq
R14 (LR)  R14 (LR)  R14_svc     R14_abt   R14_und    R14_irq    R14_fiq
R15 (PC)  R15 (PC)  R15 (PC)    R15 (PC)  R15 (PC)   R15 (PC)   R15 (PC)
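The banking pattern in the table can be expressed as a small lookup. This is a sketch: the function name and the short mode labels `usr` and `sys` are chosen here for illustration, while the other suffixes follow the table. Counting the distinct names it produces gives the number of general-purpose registers.

```python
MODES = ["usr", "sys", "svc", "abt", "und", "irq", "fiq"]


def physical_register(mode: str, n: int) -> str:
    """Map a mode and register number (0-15) to the banked register it selects."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if not 0 <= n <= 15:
        raise ValueError("register number must be in 0..15")
    if mode == "fiq" and 8 <= n <= 12:
        return f"R{n}_fiq"              # fast interrupt banks R8-R12 as well
    if n in (13, 14) and mode not in ("usr", "sys"):
        return f"R{n}_{mode}"           # each exception mode has its own SP and LR
    return f"R{n}"                      # shared with User mode


def distinct_registers() -> set[str]:
    """All physical general-purpose registers reachable from any mode."""
    return {physical_register(m, n) for m in MODES for n in range(16)}


if __name__ == "__main__":
    print(physical_register("fiq", 8), physical_register("svc", 13))
    print(len(distinct_registers()))
```

The count comes to 31: sixteen shared names, five extra fast-interrupt registers, and ten banked copies of R13 and R14.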

ARM Register Organization

• 37 x 32-bit registers
• 31 general-purpose registers
  - Some have special purposes
  - E.g. program counters
• Six program status registers
• Registers in partially overlapping banks
  - Processor mode determines bank
• 16 numbered registers and one or two program status registers visible

General Register Usage

• R13 normally stack pointer (SP)
  - Each exception mode has its own R13
• R14 link register (LR)
  - Subroutine and exception mode return address
• R15 program counter

CPSR

• CPSR: current program status register
  - Exception modes have dedicated SPSR
• 16 msb are user flags
  - Condition codes (N, Z, C, V)
  - Q: overflow or saturation in some SIMD instructions
  - J: Jazelle (8 bit) instructions
  - GE[3:0]: SIMD instructions use bits [19:16] as greater than or equal flags
• 16 lsb are system flags for privileged modes
  - E: endian
  - Interrupt disable
  - T: Normal or Thumb instruction

[Figure: CPSR and SPSR]
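The flag fields can be pulled out of a 32-bit status word with shifts and masks. The slide gives only the GE field position ([19:16]); the other bit positions and the mode encodings below are taken from the ARM architecture as commonly documented and should be read as assumptions here, as should the function name.

```python
# Assumed mode encodings for the low five bits of the status word.
MODE_NAMES = {
    0b10000: "User",
    0b10001: "Fast interrupt",
    0b10010: "Interrupt",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_cpsr(value: int) -> dict[str, object]:
    """Split a 32-bit CPSR value into named fields (bit positions assumed)."""
    def bit(n: int) -> bool:
        return bool((value >> n) & 1)

    return {
        "N": bit(31), "Z": bit(30), "C": bit(29), "V": bit(28),  # condition codes
        "Q": bit(27),                   # overflow or saturation
        "J": bit(24),                   # Jazelle state
        "GE": (value >> 16) & 0xF,      # bits [19:16]
        "E": bit(9),                    # endianness
        "A": bit(8), "I": bit(7), "F": bit(6),   # interrupt disable bits
        "T": bit(5),                    # Thumb state
        "mode": MODE_NAMES.get(value & 0x1F, "unknown"),
    }


if __name__ == "__main__":
    print(decode_cpsr(0x600000D3))
```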

ARM Interrupt (Exception) Processing

• More than one exception allowed
• Seven types
• Execution forced from exception vectors
• Multiple exceptions handled in priority order
• Processor halts execution after current instruction
• Processor state preserved in SPSR for exception
  - Address of instruction about to execute put in link register

Foreground Reading

• Processor examples
• Stallings Chapter 12
• Manufacturer web sites & specs
