ECE154 Computer Architecture
Lecture #1
Edward Chang
Electrical & Computer Engineering
ECE154 Winter 2003, 1/6/2003

ECE154 Team
• Instructor: Edward Chang
• TAs
  – Chao, Chia-Tso
  – Gadepalli, S.
• Office Hours and Contact Information
  – Check class Web site frequently
  – http://www.mmdb.ece.ucsb.edu/~ece154/
• Class Email Address
  – ece154@mmdb.ece.ucsb.edu

Textbook and Prerequisite
• “Computer Architecture: A Quantitative Approach”, 3rd edition, J. Hennessy and D. Patterson
• Prerequisite
  – Basic digital logic
  – Assembly programming
  – Simple OS concepts

Assignments and Exams
• 6 Homework Assignments
• Midterm on 2/10
• Final on 3/12
• Grading
  – HW: 20%; Midterms: 30%; Final: 50%
• Policies
  – All deadlines are final, no exceptions
  – Regrading requests must be in writing

Discussion Session Signup
• Four Sessions (Friday)
  – 8–9am: 387 104
  – 12–1pm: SNDCR 1637
  – 1–2pm: 387 103
  – 2–3pm: PHELP 1445

Course Outline
• Instruction-set architecture
• Hardware and software techniques for instruction-level parallelism
• Memory hierarchy and IO storage
• Multiprocessor and multimedia processors

Today’s Outline
• Class Goals
• What is Computer Architecture
  – Design process, constraints, tools
• Performance
  – Execution Time vs. Throughput
  – CPU time equation
• Basic Design Principles
  – Amdahl’s law
• Chip Cost

Class Goals
• Understand how computer systems are organized and why they are organized that way
• Be conversant with techniques for analyzing performance and comparing systems
• Introduction to computer implementation techniques
• Will discuss advanced topics

Components of a General-Purpose Computer System
• Programmable Processor
  – CPU, DSP, microprocessor
• Memory
  – For program and data
  – Cache, DRAM, MEMS
• Storage
• Buses and Controllers
  – Connecting CPU, memory, storage
  – Connecting networks

Computer Architecture?

History of Computer Systems
• 1960s – 1970s
  – Mainframes and minicomputers
• Late 1970s
  – Emergence of microprocessor
• Late 1980s
  – C and compilers
  – Unix
  – RISC

Relative Performance

The Contribution of Architectural Innovation
• 2001 Statistics
  – Base circuit design performance: X
  – Architectural innovation: 15 X
• ECE154 is about
  – Innovative architecture ideas
  – Measured by a quantitative approach

Computer Categories
• Desktops
  – Examples: PCs, workstations
  – Metrics: latency (graphics & IO)
• Servers
  – Examples: Web, database servers
  – Metrics: throughput, reliability, scalability
• Embedded Systems
  – Examples: PDAs, cell phones, set-top boxes
  – Metrics: complexity, power, latency

Technology Trends (Yearly Improvement)
• Integrated Circuit: logic
  – 35% increase in density, 55% increase in transistors per chip
  – 15% speedup
• Integrated Circuit: memory
  – 60% increase in density
  – 7% reduction in latency
  – 14% increase in throughput
• Magnetic Disks
  – 100% increase in density
  – Access time improvement 33% in ten years
• Networking
  – Little improvement in latency
  – Large improvement in bandwidth

Technology Trends (Yearly Improvement)
• Feature Size (f)
  – Shrinks from 10 microns (1971) to 0.18 microns (2001)
• Device Size
  – Shrinks quadratically with decrease in feature size
• Transistor Count
  – Improves quadratically w.r.t. 1/f
• Transistor Performance
  – Improves linearly w.r.t. 1/f

Performance Bottlenecks
• Wires
  – No improvement
  – Pentium 4 allocates 2 stages of its 20-stage pipeline to wire delay
• Power
  – Switching frequency * load capacitance * voltage^2
  – The 1st microprocessor consumes 1/10 watt
  – 2GHz Pentium 4 consumes 100 watts

Changing Technology Changes Architecture

Cost, Price, and Trends
• Factors Affecting Cost
  – Yield
  – Competition
• Price
  – Memory: Price = cost
  – CPU: Price = (1+k%) cost
• Computer Price
  – Dominated by processor
  – Monitor

Relative Performance

Measuring Performance
• X performs N times better than Y
  – X runs N times faster than Y
  – X takes 1/N time, compared to Y, to complete the task
  – X takes 1/(N+1) time, compared to Y, to complete the task
  – X responds N times faster than Y
• Improve a system’s performance N-fold
  – Decrease its running time N-fold
  – Increase its throughput N-fold
• CAQA’s position
  – Use Execution Time to measure performance

Execution Time
• Elapsed Time
• Response Time
• CPU Time
  – User CPU time and system CPU time
  – The choice of CAQA to measure processor performance (throughout Chapter 4)
• Many Benchmarks
  – But none can be perpetually and universally applicable

Benchmark Examples

Performance Summary
• A is 10 times faster than B for program P1
• B is 10 times faster than A for program P2
• A is 20 times faster than C for program P1
• C is 50 times faster than A for program P2
• B is 2 times faster than C for program P1
• C is 5 times faster than B for program P2

Execution Time

              Computer A   Computer B   Computer C
P1                     1           10           20
P2                  1000          100           20
Total Time          1001          110           40

Arithmetic Means
• B is 9.1 times faster than A for P1 + P2
• C is 25 times faster than A for P1 + P2
• C is 2.75 times faster than B for P1 + P2
• Arithmetic mean is fine
  – if P1 and P2 are run equally in the workload

Weighted Execution Time
• Weighted Arithmetic Mean
  – $\sum_i w_i \times t_i$
  – $w_i$: frequency of execution of the $i$th program

Weights

                                   Weightings
Programs      A      B     C   W(1) on C   W(2) on B   W(3) on A
P1            1     10    20       0.5       0.909       0.999
P2         1000    100    20       0.5       0.091       0.001

Weighted Arithmetic Means

            A        B      C
P1          1       10     20
P2       1000      100     20
W(1)    500.5       55     20
W(2)    91.91    18.19     20
W(3)        2    10.09     20

Geometric Means
• The Weighted Arithmetic Mean differs depending on which machine is the reference one.
• Geometric Mean

Summary of Means

                  Normalized to A        Normalized to B        Normalized to C
                  A      B      C        A      B      C        A      B      C
P1               1.0   10.0   20.0      0.1    1.0    2.0      0.05    0.5    1.0
P2               1.0    0.1   0.02     10.0    1.0    0.2      50.0    5.0    1.0
Arithmetic mean  1.0   5.05  10.01     5.05    1.0    1.1     25.03   2.75    1.0
Geometric mean   1.0    1.0   0.63      1.0    1.0   0.63      1.58   1.58    1.0
Total time       1.0   0.11   0.04      9.1    1.0   0.36     25.03   2.75    1.0

Design Principles
• Make the Common Case Fast
• Principle of Locality
• Parallelism

Make the Common Case Fast
• Amdahl’s Law
  – The performance gain is limited by the fraction of time the faster mode can be used
• Example
  – Meal time = Order time + Eating time
  – 40 mins = 20 mins + 20 mins
  – Speeding up eating alone cannot reduce the meal time to be under 20 mins

Examples
• IO time is 60% of the execution time, and CPU time 40%
• Speeding up CPU 10 times achieves overall speedup of
  – 1 / (0.6 + (0.4/10))
  – 1.56

CPU Performance Equation

Computer Architecture?

Summary
• Computer Architecture
  – Design to last through trends
• Performance Metrics
  – Amdahl’s law
  – CPU time formula
• Design Principles
  – Make the common case fast
  – Locality
  – Parallelism

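
The figures quoted in this lecture (the Amdahl's law speedup of 1.56, the weighted arithmetic means, and the geometric means of normalized execution times) can be re-derived with a few lines of code. The following is a minimal Python sketch and is not part of the original slides. The names `TIMES`, `amdahl_speedup`, `weighted_mean`, `geometric_mean` and `normalized` are illustrative assumptions, and the execution times are copied from the lecture's Execution Time table for computers A, B and C.

```python
from math import prod

# Execution times [P1, P2] per computer, copied from the lecture's table.
TIMES = {"A": [1.0, 1000.0], "B": [10.0, 100.0], "C": [20.0, 20.0]}


def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when only `fraction_enhanced` of the time is sped up."""
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


def weighted_mean(times, weights):
    """Weighted arithmetic mean: sum of w_i * t_i over all programs."""
    return sum(w * t for w, t in zip(weights, times))


def geometric_mean(values):
    """n-th root of the product of n values."""
    return prod(values) ** (1.0 / len(values))


def normalized(machine, reference):
    """Execution times of `machine` divided by those of `reference`."""
    return [t / r for t, r in zip(TIMES[machine], TIMES[reference])]


if __name__ == "__main__":
    # IO is 60% of the time, CPU is 40%; the CPU becomes 10 times faster.
    print(f"Amdahl speedup: {amdahl_speedup(0.4, 10):.2f}")
    # Meal example: eating is half of a 40-minute meal, so even a huge
    # eating speedup still leaves the 20 minutes spent ordering.
    print(f"Meal time: {40 / amdahl_speedup(0.5, 1e9):.1f} mins")
    # Weighting W(2) = (0.909, 0.091) applied to computers A and B.
    for name in ("A", "B"):
        print(name, round(weighted_mean(TIMES[name], [0.909, 0.091]), 2))
    # Geometric mean of C's normalized times, for each reference machine.
    for ref in ("A", "B", "C"):
        print(ref, round(geometric_mean(normalized("C", ref)), 2))
```

The geometric mean of ratios equals the ratio of geometric means. This is why the relative standing it assigns to A, B and C does not change with the reference machine, whereas the arithmetic mean of normalized times does.
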
References
• Textbook figures, publisher
• Lecture notes, Prof. Kozyrakis, Stanford