ipdps11EduParTalk.ppt 443KB Jun 23 2011 01:06:34 PM
Joint UIUC/UMD Parallel
Algorithms/Programming Course
David Padua, University of Illinois at Urbana-Champaign
Uzi Vishkin, University of Maryland, speaker
Jeffrey C. Carver, University of Alabama
Motivation 1/4
Programmers of today’s parallel machines must overcome 3
productivity busters, beyond just identifying operations that
can be executed in parallel:
(i) impose the often difficult 4-step programming-for-locality
recipe: decomposition, assignment, orchestration, and
mapping [CS99]
(ii) reason about concurrency in threads; e.g., race conditions
(iii) for machines such as GPUs, which fall behind on serial (or
low-parallelism) code, whole programs must be highly
parallel
Motivation 2/4: Commodity computer systems
If you want your program to run significantly faster … you’re going to
have to parallelize it
Parallelism: only game in town
But, where are the players?
“The Trouble with Multicore: Chipmakers are busy designing
microprocessors that most programmers can't handle”—D.
Patterson, IEEE Spectrum 7/2010
• Only heroic programmers can exploit the vast parallelism in current
machines – Report by CSTB, U.S. National Academies 2011
• An education agenda must: (i) recognize this reality, (ii) adapt to it,
and (iii) identify broad-impact opportunities for education
Motivation 3/4: Technical Objectives
• Parallel computing exists for providing speedups over serial computing
• Its emerging democratization means the general body of CS students &
graduates must be capable of achieving good speedups
What is at stake?
A general-purpose computer that can be programmed effectively by too
few programmers, or that requires excessive learning, makes application SW
development cost more, weakening the market potential of not only the
computer:
Traditionally, economists look to the manufacturing sector for bettering
the recovery prospects of the economy. Software production is the
quintessential 21st century mode of manufacturing. These prospects
are at peril if most programmers are unable to design effective
software for mainstream computers
Motivation 4/4: Possible Roles for Education
• Facilitator. Prepare & train students and the workforce
for a future dominated by parallelism.
• Testbed. Experiment with vertical approaches and
refine them to identify the most cost-effective ways
for achieving speedups.
• Benchmark. Given a vertical approach, identify the
developmental stage at which it can be taught.
Rationale: Ease of learning/teaching is a necessary
(though not sufficient) condition for ease-of-programming
The joint inter-university course
• UIUC: Parallel Programming for Science and Engineering, Prof: DP
• UMD: Parallel Algorithms, Prof: UV
• Student population: upper-division undergrads and graduate
students. Diverse majors and backgrounds
• ~1/2 of the fall 2010 sessions were held jointly by videoconferencing.
Objectives
1. Demonstrate logistical and educational feasibility of a real-time co-taught course.
Outcome: Overall success. Minimal glitches. Helped to alert students
that success on material taught by the other prof is as important.
2. Compare OpenMP using 8-processor SMP against PRAM/XMTC using
64-processor XMT (