Lecture_1.ppt 2323KB Jun 23 2011 12:12:12 PM

CS 352H: Computer Systems Architecture

Lecture 1: What is Computer Architecture and why should I care?

Professor Emmett Witchel
University of Texas at Austin
[email protected]

Lecture 1


Goals
• Understand the “how” and “why” of computer system organization
– Instruction Set Architecture
– System Organization (processor, memory, I/O)
– Microarchitecture
– Virtualization

• Learn methods of evaluating performance
– Metrics & benchmarks

• Learn how to make systems go fast
– Pipelining, caching
– Parallelism (ILP, DLP, TLP)
– Application specific architectures (graphics, signal proc.)

• Preview of where architecture is heading

Logistics
Lectures: T/Th 12:30-2:00pm, PAI 3.14
Instructor: Prof. Emmett Witchel, W 1:15-2:15
TA: Shalini Sahoo, MW 11:30-1:00pm, PAI 5.38 Desk 1
Grading: see web page
Texts: Patterson & Hennessy, Computer Organization and Design (Fourth Edition)
– Including CD
– Revised Fourth Edition preferred, not required


CS352H Online
URL: www.cs.utexas.edu/users/witchel/CS352H
I will occasionally email you via Blackboard and at your
registered email address. I expect this channel to be
reliable and timely.
Discussion group: via Blackboard
– Login at courses.utexas.edu
– General, Homeworks, Project
Computer Architecture Seminar Series:
www.cs.utexas.edu/users/cart/arch


Assignment for Next Tuesday
• Turn in student survey forms, if you want

• Read the Moore paper (see webpage)
– Write a review of 1/2-1 page (see syllabus)
– Review should include
• Summary of content of paper
• Your observations on the most interesting/important
aspects
• Your observations on its relevance today
– Be prepared to discuss on Tuesday in class


Discussion
• Are you interested in taking this course?
• One question about computer science
• One question about computer architecture

CS352H
Fall 2007



Specification

Compute the Fibonacci sequence
for(i=2; i

100x more devices since 1989
10x faster devices

Changing Technology leads to Changing Architecture

• 1970s
– multi-chip CPUs
– semiconductor memory very expensive
– microcoded control
– complex instruction sets (good code density)

• 1980s
– single-chip CPUs, on-chip RAM feasible
– simple, hard-wired control
– simple instruction sets
– small on-chip caches

• 1990s
– lots of transistors
– complex control to exploit instruction-level parallelism

• 2000s
– even more transistors
– Power wall
– Transition to CMPs
– Multi-level caches

• 2010s
– Embedded vs. Desktop vs. Data center (cloud)
– New storage (PCM, flash)
– Simpler cores and lots of them
– Optimizing for power

Intel 4004 - 1971

• The first microprocessor
• 2,300 transistors
• 108 kHz
• 10µm process


Some Recent Chips!

Intel Pentium IV
• 42 million transistors
• 4GHz
• 0.13µm process
• Could fit ~15,000 4004s on this chip!

NVidia GeForce 6800
• 222 million transistors
• 400MHz
• 0.13µm process

Intel Itanium II (Montecito)
• 1.7 billion transistors
• 1.6 GHz
• 90nm process

IBM Cell
• 8 vector processors + 1 PPC
• 4 GHz
• 90nm process

Intel’s net revenue was around $35 billion a year for most of the aughts
R&D: about $5 billion a year

Any Architecture You Want (as long as it is x86)


Application Constraints


• Applications drive machine ‘balance’
– Numerical simulations
• floating-point performance
• main memory bandwidth
– Transaction processing
• I/Os per second
• integer CPU performance
– Decision support
• I/O bandwidth
– Embedded control
• I/O timing, power
– Media processing
• low-precision ‘pixel’ arithmetic


Application-Driven Architectures
• General purpose - good performance on “all”
programs
– x86 family, ARM, PowerPC, etc.

• Application specificity can focus on:
– Types of concurrency available
– Domain of deployment (server, handheld, desktop)

• Today - overview of graphics processors
– Interface (instruction set architecture - ISA)
– Processor organization
– Concurrent elements


Apple’s iPad/iPhone4 Powered by A4 Chip
• A4 is modified ARM Cortex run at 1GHz
– Integrated processor, graphics, memory controller

• Among other claims, ARM says the processor gets a
near "25 percent processing power boost, even at
same processor speed, from the use of a new
instruction pipelining system."
– We will cover pipelining in this class.

• Claim: 10 hours of 1024x768 video on a 25 Wh battery
• Let’s look at the Freescale i.MX51
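The battery claim above implies a rough power budget. The sketch below assumes the figure of 25 is the battery capacity in watt-hours (an assumption; the function name is illustrative as well) and divides it by the claimed runtime to get the average power the whole device may draw.

```python
# Back-of-the-envelope power budget for the video-playback claim.
# Assumption: the "25" is battery capacity in watt-hours, and the
# battery is drained completely over the claimed runtime.

def average_power_watts(battery_wh: float, runtime_hours: float) -> float:
    """Average power draw that empties the battery in runtime_hours."""
    return battery_wh / runtime_hours


budget = average_power_watts(25, 10)
print(f"{budget:.1f} W average for the whole system")  # 2.5 W
```

Under that assumption, the processor, display, memory and radios together have to stay within a few watts, which is why optimizing for power shapes the design.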


Performance: Latency and Throughput
• Latency: time to complete an operation
• Throughput: work completed per unit time
• Consider plumbing
– Low latency: turn on faucet and water comes out
– High bandwidth: lots of water (e.g., to fill a pool)

• What is “High speed Internet?”
– Low latency: needed for interactive gaming
– High bandwidth: needed for downloading large files
– Marketing departments like to conflate latency and
bandwidth…

Relationship between Latency and Throughput
• Latency and bandwidth only loosely coupled
– Henry Ford: assembly lines increase bandwidth without
reducing latency

• My factory takes 1 day to make a Model T Ford.
– But I can start building a new car every 10 minutes
– At 24 hrs/day, I can make 24 * 6 = 144 cars per day
– A special order for 1 green car still takes 1 day
– Throughput is increased, but latency is not.

• Latency reduction is difficult
• Often, one can buy bandwidth
– E.g., more memory chips, more disks, more computers
– Big server farms (e.g., Google) are high bandwidth
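The factory arithmetic can be checked with a short sketch. The function names and the steady-state model (a new car started at a fixed interval, every car taking the full build time) are illustrative assumptions, not part of the slides.

```python
# Minimal model of the assembly-line example: latency vs. throughput.
# Assumption: the line is full (steady state), so one car finishes
# for every car that is started.

MINUTES_PER_DAY = 24 * 60


def throughput_per_day(start_interval_min: float) -> float:
    """Cars completed per day when a new car starts every start_interval_min."""
    return MINUTES_PER_DAY / start_interval_min


def completion_time_min(n_cars: int, latency_min: float,
                        start_interval_min: float) -> float:
    """Minutes until the last car of an n-car order rolls off the line.

    The first car takes the full latency; each later car finishes one
    start interval after the previous one.
    """
    return latency_min + (n_cars - 1) * start_interval_min


latency = MINUTES_PER_DAY   # 1 day to build one Model T
interval = 10               # a new car is started every 10 minutes

print(throughput_per_day(interval))               # 144.0 cars per day
print(completion_time_min(1, latency, interval))  # 1440 minutes: still 1 day
```

Halving the start interval doubles throughput in this model, but the single green car still waits the full day: only a shorter build time reduces latency.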

What is cloud computing?
• Cloud computing is where dynamically scalable and
often virtualized resources are provided as a service
over the Internet (thanks, Wikipedia!)
• Infrastructure as a service (IaaS)
– Amazon’s EC2 (Elastic Compute Cloud)

• Platform as a service (PaaS)
– Google Gears
– Microsoft Azure

• Software as a service (SaaS)
– Gmail
– Facebook
– Flickr

Thanks, James Hamilton, Amazon

Graphics has dedicated chip in PCs
[Diagram: the CPU (Intel “Kentsfield” quad core, QX6700, 65nm, two dies, 8MB L2$; 582 million transistors) connects to the Memory Controller Chip (“North Bridge”). The North Bridge connects to Memory, to the Graphics Processor (GeForce 8800, 90nm; 681 million transistors) over AGP or PCIe, and to the Input/Output Glue Chip (“South Bridge”), which serves Disk, Keyboard, PCIe, etc.]

GPU/CPU Performance comparison

[Chart: GFLOPS of NVIDIA GPUs, with reference points marked for Core 2 Quad 3GHz (96 GFLOPS) and IBM Cell (~200 GFLOPS)]

G80 = GeForce 8800 GTX
G71 = GeForce 7900 GTX
G70 = GeForce 7800 GTX
NV40 = GeForce 6800 Ultra
NV35 = GeForce FX 5950 Ultra
NV30 = GeForce FX 5800

Source: NVIDIA (except CELL and Core2 Quad)

Why a dedicated processing chip?



1) Specialization – becoming less important with time
2) Parallelism – becoming more important
Graphics processors are the only highly-parallel
processors in every desktop machine.

128 “processors” * 2 FLOPS @ 1.35 GHz

You can program them
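The three numbers above multiply out to a peak rate. The sketch below reads “2 FLOPS” as two floating-point operations per processor per clock cycle (an assumption, e.g. a multiply and an add), and the function name is illustrative. The 96 GFLOPS figure for the Core 2 Quad is taken from the comparison slide.

```python
# Peak floating-point rate = units * operations per cycle * clock rate.
# Assumption: every unit issues its maximum number of operations on
# every cycle, which real programs rarely sustain.

def peak_gflops(units: int, flops_per_cycle: float, clock_ghz: float) -> float:
    """Theoretical peak in GFLOPS (1e9 floating-point operations per second)."""
    return units * flops_per_cycle * clock_ghz


g80 = peak_gflops(128, 2, 1.35)
core2_quad = 96.0   # GFLOPS, as quoted on the comparison slide

print(f"GeForce 8800 peak: {g80:.1f} GFLOPS")
print(f"Ratio to Core 2 Quad: {g80 / core2_quad:.1f}x")
```

The peak comes entirely from the count of parallel units; the GPU clock is less than half that of the 3GHz CPU it is compared against.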


Graphics requires programmability
Every application does something a bit different.
Example Cg “shader” program (invoked like a “callback”
function):
void normalmapped(float2 normalMapTexCoord : TEXCOORD0,
                  out float4 color : COLOR,
                  uniform float ambient,
                  …)
{
    float3 normalTex, …;
    normalTex = tex2D(normalMap, normalMapTexCoord).xyz;

    diffuse = saturate(dot(normal, normLightDir));

    color = Kd * (ambient + diffuse) +
            Ks * pow(specular, specularExponent);
}
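The lighting arithmetic in the shader can be followed on the CPU with a small sketch. This is not Cg and not the original program: the texture lookup and the elided parts are left out, `specular` is passed in as a parameter, and Kd and Ks are treated as scalars, all of which are simplifying assumptions.

```python
# Scalar sketch of the shader's final color computation:
#   color = Kd * (ambient + diffuse) + Ks * specular ** exponent
# Assumption: vectors are plain 3-element tuples and the result is a
# single intensity rather than a float4 color.

def saturate(x: float) -> float:
    """Clamp x to the range [0, 1], like Cg's saturate()."""
    return max(0.0, min(1.0, x))


def dot(a, b) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def shade(normal, norm_light_dir, ambient, kd, ks, specular, specular_exponent):
    """Intensity for one pixel; the GPU runs this once per pixel, in parallel."""
    diffuse = saturate(dot(normal, norm_light_dir))
    return kd * (ambient + diffuse) + ks * pow(specular, specular_exponent)


# Surface facing the light: full diffuse term.
lit = shade((0, 0, 1), (0, 0, 1), ambient=0.2, kd=0.5, ks=0.5,
            specular=1.0, specular_exponent=8)
# Surface facing away: diffuse clamps to 0, leaving only the ambient term.
unlit = shade((0, 0, 1), (0, 0, -1), ambient=0.2, kd=0.5, ks=0.5,
              specular=0.0, specular_exponent=8)
print(lit, unlit)
```

Each pixel's result depends only on its own inputs, which is what lets the hardware evaluate many pixels at once.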

GeForce 8800


Next Time
• Performance evaluation
• Basic computer organization
• How chips are made
• Start in on instruction set review/overview

• Always check web page for assignments
