High-Performance Computing
Indonesian Institute of Sciences (Lembaga Ilmu Pengetahuan Indonesia, LIPI)
Research Center for Informatics (Pusat Penelitian Informatika)
1965: Lembaga Elektroteknika Nasional (LEN)
1986 (Keppres No. 1 of 1986): Puslitbang Telkoma, Inkom, Telimek and UPT Pusat LEN
1990: UPT LEN handed over to BPIS (spin-off)
2001 (SK Ka. LIPI No. 1151/M/2001): Pusat Penelitian Informatika
Staff Profile
Staff categories: researchers and research candidates, non-researcher functional staff, and general functional staff
Gender: women 30%, men 70%
Researchers: 54 people (59%)
General functional staff (administration): 24 people (27%)
Technical staff (Penata Teknis): 10 people (11%)
Non-researcher functional staff: 3 people (3%)
Age range (years: number of staff)
26-30: 11
31-35: 15
36-40: 19
41-45: 19
46-50: 11
51-55: 8
56-60: 5
61-65: 3
Research @P2Informatika
• Computational Science
Selected Research Results
USING VISION-BASED TECHNOLOGY TO ASSESS PRODUCT QUALITY
Text Data Model for Weather Data
DNA QR codes of (a) JX426135; (b) JN245997; (c) JN245994; (d) JN632605
Data Hiding Scheme for Digital Image
Rainfall Simulation in Indonesia
Adapting a climate model to the Indonesian region using RegCM 4.0
RegCM4 (Regional Climate Model) is used to run climate simulations specific to the Indonesian region.
Wave Energy Propagation
The aim is to track the propagation of waves that carry substantial energy potential. The research focuses on energy propagation in water surface waves.
[Figure: density plot of wave height and wave energy]
Simulation of Mosquito Population Dynamics with Cellular
Modeling and simulating mosquito population dynamics is a computational biology study aimed at understanding changes in the population size of a species. The simulation results show that the proposed model can simulate the mosquito population both temporally and spatially.
[Figure: mosquito dispersal simulation]
Development of Algorithms for a Vision-Based Test System
Test system for various quality parameters
Sensor chip testing
Testing deployed in mass production
[Figure: timeline 2015 to 2016; CERN, Switzerland; mass production, produced chips, visual testing, selected chips]
Research @P2Informatika
• Big Data & IoT
Types of Service
Public computing services
Dissemination of high-performance computing technology
HPC LIPI @P2Informatika
Facilities
Cibinong: Gedung Pusat Inovasi, Jl. Raya Jakarta-Bogor KM 47, Cibinong, Jawa Barat
Bandung: Gedung 10 Kompleks LIPI, Jl. Cisitu No. 21, Bandung, Jawa Barat
HPC Facilities
Master Node (4 nodes)
- Processor: 2 x 8-core Intel Xeon E5 family
- Memory: 128 GB
- Storage: 24 TB
GPU Node (20 nodes)
- Processor: 2 x 4-core Intel Xeon E5 family
- Memory: 8 - 16 GB
- Storage: 500 GB
- GPU: Tesla M2075 (448 cores)
Basic Node (114 nodes)
- Processor: 2 x 4-core Intel Xeon E5 family
- Memory: 8 - 16 GB
- Storage: 500 GB
High Memory Node (8 nodes)
- Processor: 2 x 8-core Intel Xeon E5 family
- Memory: 256 GB
- Storage: 2 x 300 GB
HPC LIPI @P2Informatika
• Cibinong
− 928 cores
− 3072 GB RAM
− 103 TB storage
• Bandung
− 336 cores
− 560 GB RAM
− 67 TB storage
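As a quick consistency check, the per-node figures can be summed and compared with the per-site core counts. This is only a sketch: it assumes the node list covers both sites together, that "2 x N core" means two sockets of N cores each, and that GPU cores are not counted. The names `NODE_GROUPS` and `total_cpu_cores` are illustrative.

```python
# Node groups from the facility list: (name, number of nodes, CPU cores per node).
# Assumption: "2 x 8 core" means two sockets with 8 cores each.
NODE_GROUPS = [
    ("Master", 4, 2 * 8),
    ("GPU", 20, 2 * 4),
    ("Basic", 114, 2 * 4),
    ("High Memory", 8, 2 * 8),
]


def total_cpu_cores(groups):
    """Sum CPU cores over all node groups (GPU cores are not included)."""
    return sum(count * cores for _name, count, cores in groups)


listed_total = total_cpu_cores(NODE_GROUPS)  # from the node specifications
site_total = 928 + 336                       # Cibinong + Bandung, as listed per site

print(listed_total, site_total)
```

Under these assumptions the two totals agree, which suggests the node list describes both sites combined.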
What Is Parallel Computing?
Serial computing:
A conventional desktop computer has a single Central Processing Unit (CPU), and a computation is carried out by breaking the problem into a series of discrete instructions.
The computer executes the instructions one after another, because only one instruction can run at a time.
Parallel computing:
High-performance computing hardware, by contrast, consists of multiple CPUs configured to run calculations in parallel.
Each problem must be broken into discrete parts that can be computed concurrently.
Each part is then broken down into a series of instructions.
Those instructions are computed simultaneously on different CPUs.
• A computational problem to be run on an HPC system must be able to:
• be split into discrete pieces of work that can be solved at the same time;
• execute several program instructions at any moment in time;
• be solved in less time with multiple compute resources than with a single compute resource.
• The compute resources are typically:
• a computer with many processors / cores
• a number of computers connected by a network
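The difference between the serial and the parallel formulation can be sketched in a few lines of Python. This is an illustration only: the sum-of-squares workload and the function names are made up for the example, and a thread pool is used just to show the structure. On CPython, a process pool or a message-passing library would be needed to get a real speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum_of_squares(chunk):
    """The series of instructions applied to one discrete part of the problem."""
    return sum(x * x for x in chunk)


def split(data, parts):
    """Break the problem into at most `parts` contiguous chunks."""
    size = max(1, (len(data) + parts - 1) // parts)
    return [data[i:i + size] for i in range(0, len(data), size)]


def serial_sum_of_squares(data):
    """Serial version: one worker processes every element in turn."""
    return partial_sum_of_squares(data)


def concurrent_sum_of_squares(data, workers=4):
    """Concurrent version: each chunk is handled by its own worker."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum_of_squares, chunks))


data = list(range(1000))
print(serial_sum_of_squares(data), concurrent_sum_of_squares(data))
```

Both versions return the same result; only the way the work is divided differs.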
Suppose you have a favorite computational application.
One processor delivers the result in N hours.
Why not use N processors and get the result in just 1 hour?
The concept:
Parallelism = using multiple processors on a single problem
► Two components of parallel programming:
Computation
Communication
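Why N processors rarely deliver exactly an N-fold reduction can be illustrated with a toy cost model that separates the two components named above. The model is an assumption made for illustration (perfectly divisible computation, plus a communication cost that grows linearly with the number of additional processors), not a measurement of any real system.

```python
def parallel_hours(serial_hours, processors, comm_hours_per_link=0.0):
    """Estimated wall-clock hours on `processors` CPUs under the toy model."""
    compute = serial_hours / processors                    # computation, divided evenly
    communicate = comm_hours_per_link * (processors - 1)   # assumed linear overhead
    return compute + communicate


def speedup(serial_hours, processors, comm_hours_per_link=0.0):
    """Ratio of serial time to estimated parallel time."""
    return serial_hours / parallel_hours(serial_hours, processors, comm_hours_per_link)


n = 8
print(parallel_hours(n, n))        # no communication cost: N hours become 1 hour
print(parallel_hours(n, n, 0.1))   # with communication cost: noticeably more than 1 hour
print(speedup(n, n, 0.1))          # so the speedup is below N
```

With zero communication cost the N-hour job finishes in one hour; any nonzero cost pushes the result above one hour, which is why communication is treated as a component of its own.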
A Computer Cluster
Regular PC: 1 CPU, 1 or 2 hard disks, some memory (512 MB, 1 GB, ...)
A computer cluster: a front-end node plus compute nodes (Compute-0-0, Compute-0-1, Compute-0-2)
Parallel computing is computing by committee
Parallel computing: the use of multiple computers or processors working together on a common task.
Each processor works on its own part of the problem.
Processors are allowed to exchange information with other processors.
[Figure: the grid of the problem to be solved, in the x-y plane, divided into four areas. CPU #1, CPU #2, CPU #3, and CPU #4 each work on one area and exchange data across the shared boundaries.]
Why Use HPC?
Data + Simulation = Innovation
“Calculation will increasingly replace experimentation in design of useful materials, catalysts, and drugs, leading to much greater efficiency and new opportunities for creativity”
-- Frank Wilczek, Physics in 100 Years
The real world is massively parallel:
In the real world, many complex, interrelated events happen at the same time, yet within a temporal sequence.
Compared with serial computing, parallel computing is much better suited to modeling, simulating, and understanding complex real-world phenomena.
For example, imagine modeling the following phenomena serially:
[Figures: examples of real-world phenomena]
Saving time and/or money
• In theory, applying more resources to a task shortens its time to completion, with potential cost savings.
• Parallel computers can be built from cheap commodity components.
SOLVING LARGER / MORE COMPLEX PROBLEMS:
Many problems are so large and/or complex that it is impractical or impossible to solve them on a single computer, especially given limited computer memory.
Examples:
Search engines / databases processing millions of transactions every second
“Grand Challenge problems” (en.wikipedia.org/wiki/Grand_Challenge) require petaflops and petabytes of computing resources.
Grand Challenge Problems
Solving grand challenge applications using computer modeling, simulation and analysis
Life Sciences
CAD/CAM
Aerospace
Digital Biology
E-commerce/anything
Military Applications
[Figures: Life Science; Engine Combustion Research Group]
Signal Processing/Quantum Mechanics
Convolution model (stencil)
Matrix computations (eigenvalues…)
Conjugate gradient methods
Normally not very demanding on latency and bandwidth
Some algorithms are embarrassingly parallel
Examples: seismic migration/processing, medical imaging, SETI@Home
[Figure: signal processing example]
Doing Work Concurrently
A single compute resource can do only one thing at a time.
Multiple compute resources can do many things simultaneously.
Example: collaborative networks provide a global venue where people from around the world can meet and work "virtually".
Who Uses Parallel Computing?
Science and Engineering:
Historically, parallel computing has been considered "high-end computing" and has been used to model difficult problems in many areas of science and engineering:
Atmosphere, Earth, Environment
Physics - applied, nuclear, particle, condensed matter, high pressure, fusion, photonics
Bioscience, Biotechnology, Genetics
Chemistry, Molecular Sciences
Geology, Seismology
Mechanical Engineering - from prosthetics to spacecraft
Electrical Engineering, Circuit Design, Microelectronics
Computer Science, Mathematics
Defense, Weapons
HPC Applications and Major Industries
Finite Element Modeling: Auto/Aero
Fluid Dynamics: Auto/Aero, Consumer Packaged Goods Mfgs, Process Mfg, Disaster Preparedness (tsunami)
Imaging: Seismic & Medical
Finance: Banks, Brokerage Houses (Regression Analysis, Risk, Options Pricing, What if, …)
Molecular Modeling: Biotech and Pharmaceuticals
Complex Problems, Large Datasets, Long Runs
This slide is from Intel presentation “Technologies for Delivering Peak Performance on HPC and Grid Applications”
5 January 2017
Divide and Conquer
Say 1 CPU:
1,000,000 elements
Numerical processing for 1 element = 0.1 secs
One computer will take 100,000 secs ≈ 27.8 hrs
Say 100 CPUs:
≈ 0.28 hr ≈ 17 mins
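The arithmetic behind these figures can be reproduced directly. The sketch below assumes the work divides perfectly across CPUs with no communication or startup overhead, which is the idealization the slide itself makes; the function name is illustrative.

```python
ELEMENTS = 1_000_000
SECONDS_PER_ELEMENT = 0.1


def wall_time_hours(elements, seconds_per_element, cpus):
    """Ideal wall-clock time in hours when the work is split evenly over `cpus`."""
    total_seconds = elements * seconds_per_element
    return total_seconds / cpus / 3600


one_cpu = wall_time_hours(ELEMENTS, SECONDS_PER_ELEMENT, 1)
hundred_cpus = wall_time_hours(ELEMENTS, SECONDS_PER_ELEMENT, 100)

print(f"1 CPU:    {one_cpu:.1f} hours")
print(f"100 CPUs: {hundred_cpus:.2f} hours ({hundred_cpus * 60:.1f} minutes)")
```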
Life Science Problem – an example of Protein Folding
It takes a computing year (in serial mode) to do a molecular dynamics simulation for a protein folding problem
• Excerpted from IBM David Klepacki’s The future of HPC
• Petaflop = a thousand trillion floating point operations per second
Disaster Preparedness
Project LEAD
Severe weather prediction (tornado), led by OU
HPC and dynamic adaptation to the weather forecast
Professor Seidel’s LSU CCT
Hurricane route prediction
Emergency preparedness
Show movie: HPC-enabled simulation
Cancer Gene-mining
Unsuccessful on a uni-processor
Approach
Novel parallel gene-mining algorithms
Input from microarray
Retain accuracy
Significantly speed up (superlinear)
IBM P5 supercomputer (128 node PPC).
[Figure: time to run the algorithm, keeping the number of nodes fixed. Time taken (in secs) versus number of processors (13, 39, 65, 91) for OvaMarker based selection and GeneSetMine based selection. Datasets: Mesothelioma, Breast, Renal, Leukemia, Prostate, Lung, Pancreas, Bladder, Colorectal, Ovary, Lymphoma, Melanoma.]
Did you know that the PlayStation 3 is an HPC/supercomputer?
9 cores/CPUs in one chip.
Future gaming software is no longer graphics or multimedia only
This diagram is from an IBM article on the Cell processor and compiler challenge
Global Climate Modeling Problem
Problem is to compute:
f(latitude, longitude, elevation, time) → temperature, pressure, humidity, wind velocity
Approach:
Discretize the domain, e.g., a measurement point every 10 km
Devise an algorithm to predict weather at time t+1 given t
• Uses:
- Predict major events, e.g., El Niño
- Use in setting air emissions standards
C Cox
Global Climate Modeling Computation
Computational requirements:
To match real-time, need 5 x 10^11 flops in 60 seconds = 8 Gflop/s
Weather prediction (7 days in 24 hours): 56 Gflop/s
Climate prediction (50 years in 30 days): 4.8 Tflop/s
To use in policy negotiations (50 years in 12 hours): 288 Tflop/s
To double the grid resolution, computation is at least 8x
State of the art models require integration of atmosphere, ocean, sea-ice, land models, plus possibly carbon cycle, geochemistry and more
Current models are coarser than this
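The requirements above all follow from the real-time rate scaled by how much faster than real time the simulation must run. The sketch below reproduces them under two assumptions that are not stated in the slide: the real-time rate is taken as the rounded 8 Gflop/s, and a year is counted as 12 months of 30 days, which is the convention that yields the listed 4.8 and 288 Tflop/s. The function names are illustrative.

```python
BASE_FLOPS = 8e9        # rounded real-time rate: 5 x 10^11 flops per 60 s
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 360     # assumption: 12 months of 30 days


def required_flops(simulated_hours, wall_clock_hours, base=BASE_FLOPS):
    """Sustained flop/s needed to simulate `simulated_hours` in `wall_clock_hours`."""
    return base * simulated_hours / wall_clock_hours


def resolution_cost_factor(refinement, dims=3):
    """Lower bound on extra work when the grid is refined in `dims` dimensions."""
    return refinement ** dims


fifty_years = 50 * DAYS_PER_YEAR * HOURS_PER_DAY

weather = required_flops(7 * HOURS_PER_DAY, 24)            # 7 days in 24 hours
climate = required_flops(fifty_years, 30 * HOURS_PER_DAY)  # 50 years in 30 days
policy = required_flops(fifty_years, 12)                   # 50 years in 12 hours

print(weather / 1e9, "Gflop/s")
print(climate / 1e12, "Tflop/s")
print(policy / 1e12, "Tflop/s")
print(resolution_cost_factor(2), "x work for doubled resolution in 3 dimensions")
```

The factor of 8 for doubled resolution counts only the three spatial dimensions; if the time step must also shrink, the cost grows further, which is why the slide says "at least".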
Heart Simulation
Problem is to compute blood flow in the heart
Approach:
Modeled as an elastic structure in an incompressible fluid.
The “immersed boundary method” due to Peskin and McQueen.
20 years of development in model
Many applications other than the heart: blood clotting, inner ear, paper making, embryo growth, and others
Use a regularly spaced mesh (set of points) for evaluating the fluid flow
Uses
Current model can be used to design artificial heart valves
Can help in understanding the effects of disease (leaky valves)
Related projects look at the behavior of the heart during a heart attack
Ultimately: real-time clinical work
Parallel computing: Web searching
• Functional parallelism: crawling, indexing, sorting
• Parallelism between queries: multiple users
• Finding information amidst junk
• Preprocessing of the web data set to help find information
• General themes of sifting through large, unstructured data sets
Parallel Programming: Decomposition Techniques
Functional Decomposition (Functional Parallelism)
Decomposing the problem into different tasks which can be distributed to multiple processors for simultaneous execution
Good to use when there is no static structure or fixed determination of the number of calculations to be performed
Domain Decomposition (Data Parallelism)
Partitioning the problem's data domain and distributing portions to multiple processors for simultaneous execution
Good to use for problems where:
- data is static (e.g. solving large matrix or finite difference or finite element calculations)
- a dynamic data structure is tied to a single entity and that entity can be separated
- the domain is fixed but computation within various regions of the domain is dynamic (fluid vortices models)
Combination of functional and domain decomposition
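The two techniques can be contrasted with a small Python sketch. Everything here is illustrative: the workload (summing a list, computing min/max/total) and the helper names are assumptions chosen for the example, and threads stand in for the processors.

```python
from concurrent.futures import ThreadPoolExecutor


def block_partition(n_items, n_workers):
    """Domain decomposition: (start, stop) index ranges whose sizes differ by at most one."""
    base, extra = divmod(n_items, n_workers)
    ranges, start = [], 0
    for rank in range(n_workers):
        stop = start + base + (1 if rank < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def domain_sum(data, n_workers=4):
    """Every worker runs the same operation on its own portion of the data."""
    parts = [data[a:b] for a, b in block_partition(len(data), n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return sum(pool.map(sum, parts))


def functional_stats(data):
    """Functional decomposition: each worker runs a different task on the same data."""
    tasks = {"min": min, "max": max, "total": sum}
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {name: pool.submit(func, data) for name, func in tasks.items()}
        return {name: future.result() for name, future in futures.items()}


data = list(range(10))
print(block_partition(len(data), 4))
print(domain_sum(data))
print(functional_stats(data))
```

In the domain version the data is split and the task is shared; in the functional version the tasks are split and the data is shared.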
Who Uses Parallel Computing?
[Figures: Bioscience, Biotechnology, Genetics; Atmosphere, Earth, Environment]
Industrial and Commercial
The following applications require processing large amounts of data in sophisticated ways.
Big Data, databases, data mining
Oil exploration
Web search engines, web based business services
Medical imaging and diagnosis
Pharmaceutical design
Financial and economic modeling
Management of national and multinational corporations
Advanced graphics and virtual reality, particularly in the entertainment industry
Networked video and multi-media technologies
Collaborative work environments
Top Ten Most Powerful Computers (http://www.top500.org)
Each entry: site; system; cores; Rmax (TFlop/s); Rpeak (TFlop/s); power (kW)
1. National Supercomputing Center in Wuxi, China; Sunway TaihuLight - Sunway MPP, Sunway SW26010 260C 1.45GHz, Sunway NRCPC; 10,649,600; 93,014.6; 125,435.9; 15,371
2. National Super Computer Center in Guangzhou, China; Tianhe-2 (MilkyWay-2) - TH-IVB-FEP Cluster, Intel Xeon E5-2692 12C 2.200GHz, TH Express-2, Intel Xeon Phi 31S1P NUDT; 3,120,000; 33,862.7; 54,902.4; 17,808
3. DOE/SC/Oak Ridge National Laboratory, US; Titan - Cray XK7, Opteron 6274 16C 2.200GHz, Cray Gemini interconnect, NVIDIA K20x Cray Inc.; 560,640; 17,590.0; 27,112.5; 8,209
4. DOE/NNSA/LLNL, US; Sequoia - BlueGene/Q, Power BQC 16C 1.60 GHz, Custom IBM; 1,572,864; 17,173.2; 20,132.7; 7,890
5. DOE/SC/LBNL/NERSC, US; Cori - Cray XC40, Intel Xeon Phi 7250 68C 1.4GHz, Aries interconnect Cray Inc.; 622,336; 14,014.7; 27,880.7; 3,939
6. Joint Center for Advanced HPC, Japan; Oakforest-PACS - PRIMERGY CX1640 M1, Intel Xeon Phi 7250 68C 1.4GHz, Intel Omni-Path Fujitsu; 556,104; 13,554.6; 24,913.5; 2,719
7. RIKEN (AICS), Japan; K computer, SPARC64 VIIIfx 2.0GHz, Tofu interconnect Fujitsu; 705,024; 10,510.0; 11,280.4; 12,660
8. Swiss National Supercomputing Centre (CSCS), Switzerland; Piz Daint - Cray XC50, Xeon E5-2690v3 12C 2.6GHz, Aries interconnect, NVIDIA Tesla P100 Cray Inc.; 206,720; 9,779.0; 15,988.0; 1,312
9. DOE/SC/Argonne National Laboratory, US; Mira - BlueGene/Q, Power BQC 16C 1.60GHz, Custom IBM; 786,432; 8,586.6; 10,066.3; 3,945
10. DOE/NNSA/LANL/SNL, US; Trinity - Cray XC40, Xeon E5-2698v3 16C 2.3GHz, Aries interconnect Cray Inc.; 301,056; 8,100.9; 11,078.9; 4,233
Computer Food Chain
[Figure: original food chain picture]
1984 Computer Food Chain: Mainframe, Mini Computer, Vector Supercomputer, Workstation, PC
1994 Computer Food Chain: Mini Computer (hitting wall soon), Workstation, Mainframe (future is bleak), Vector Supercomputer, MPP, PC
Computer Food Chain (Now and Future)
CLUSTERING OF COMPUTERS FOR COLLECTIVE COMPUTING: TRENDS
[Figure: timeline 1960, 1990, 1995+, 2000, ?]
Computing Platforms Evolution: Breaking Administrative Barriers
[Figure: performance versus administrative barriers (Individual, Group, Department, Campus, State, National, Globe, Inter Planet, Universe). Platforms shown: Desktop (Single Processor?), SMPs or SuperComputers, Local Cluster, Enterprise Cluster/Grid, Global Cluster/Grid, Inter Planet Cluster/Grid ??]
Cluster Computer and its Components
Clustering gained momentum when 3 technologies converged:
1. Very high-performance microprocessors
workstation performance = yesterday's supercomputers
2. High-speed communication
Comm. between cluster nodes >= between processors in an SMP.
3. Standard tools for parallel/distributed computing & their growing popularity.
Parallel architectures (1)
Vector machines
CPU processes multiple data sets
shared memory
advantages: performance, ease of programming
issues: scalability, price
examples: Cray SV, NEC SX, Athlon 3DNow!, Pentium 4 SSE/SSE2
Massively parallel processors (MPP)
large number of CPUs
distributed memory
advantages: scalability, price
issues: performance, programming difficulties
examples: Connection Machine CM-1 and CM-2, GAAP (Geometric Array Parallel Processor)
Parallel architectures (2)
Symmetric Multiprocessing (SMP)
two or more processors
shared memory
advantages: price, performance, ease of programming
issues: scalability
examples: UltraSPARC II, Alpha ES, generic Itanium, Opteron, Xeon, …
Non-Uniform Memory Access (NUMA)
Solves SMP’s scalability issue
hybrid memory model
advantages: scalability
issues: price, performance, programming difficulties
examples: SGI Origin/Altix, Alpha GS, HP Superdome
Clusters
Cluster consists of:
Nodes
Network
OS
Cluster middleware
Standard components
Avoiding expensive proprietary components
Cluster Architecture
[Figure: layered cluster architecture, from top to bottom:
sequential applications and parallel applications;
parallel programming environment;
cluster middleware (Single System Image and Availability Infrastructure);
PC/Workstation nodes, each with communications software and network interface hardware;
cluster interconnection network/switch]
Cluster Components...1a
Nodes
Multiple High Performance Components:
PCs
Workstations
SMPs (CLUMPS)
Distributed HPC Systems leading to Metacomputing
They can be based on different architectures and run different OSes
Cluster Components...1b
Processors
There are many (CISC/RISC/VLIW/Vector..)
Intel: Pentiums, Xeon, Merced….
Sun: SPARC, UltraSPARC
HP PA
IBM RS6000/PowerPC
SGI MIPS
Digital Alphas
Integrate memory, processing and networking into a single chip
IRAM (CPU & Mem): (http://iram.cs.berkeley.edu)
Alpha 21364 (CPU, Memory Controller, NI)
Cluster Components…2
OS
State of the art OS:
Linux (Beowulf)
Microsoft NT (Illinois HPVM)
SUN Solaris (Berkeley NOW)
IBM AIX (IBM SP2)
HP UX (Illinois - PANDA)
Mach (Microkernel based OS) (CMU)
Cluster Operating Systems (Solaris MC, SCO Unixware, MOSIX (academic project))
OS gluing layers: (Berkeley Glunix)
Cluster Components…3
High Performance Networks
Ethernet (10 Mbps)
Fast Ethernet (100 Mbps)
Gigabit Ethernet (1 Gbps)
SCI (Dolphin; MPI latency of 12 microseconds)
ATM
Myrinet (1.2 Gbps)
Digital Memory Channel
FDDI
Cluster Components…4
Network Interfaces
Network Interface Card
Myrinet has NIC
User-level access support
Alpha 21364 processor integrates processing, memory controller, and network interface into a single chip
Cluster Components…5
Communication Software
Traditional OS supported facilities (heavyweight due to protocol processing)
Sockets (TCP/IP), Pipes, etc.
Lightweight protocols (user level)
Active Messages (Berkeley)
Fast Messages (Illinois)
U-net (Cornell)
XTP (Virginia)
Systems can be built on top of the above protocols
Cluster Components…6a
Cluster Middleware
Resides between the OS and applications and offers an infrastructure for supporting:
Single System Image (SSI)
System Availability (SA)
SSI makes the collection appear as a single machine (globalised view of system resources), e.g. telnet cluster.myinstitute.edu
SA: checkpointing and process migration
Cluster Components…6b
Middleware Components
Hardware
DEC Memory Channel, DSM (Alewife, DASH), SMP techniques
OS / Gluing Layers
Solaris MC, Unixware, Glunix
Applications and Subsystems
System management and electronic forms
Runtime systems (software DSM, PFS etc.)
Resource management and scheduling (RMS):
CODINE, LSF, PBS, NQS, etc.
Cluster Components…7a
Programming environments
Threads (PCs, SMPs, NOW..)
POSIX Threads
Java Threads
MPI
Linux, NT, on many Supercomputers
PVM
Software DSMs (Shmem)
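The message-passing style that MPI and PVM provide can be imitated on a single machine to show its structure. The sketch below is not MPI: it uses only Python's standard `threading` and `queue` modules, with queues standing in for send and receive, and the names `worker` and `run` are made up for this illustration.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Receive one chunk of work, compute on it, and send the partial result back."""
    chunk = inbox.get()               # "receive" from the coordinator
    outbox.put((rank, sum(chunk)))    # "send" the partial sum, tagged with this rank


def run(data, n_workers=3):
    """Coordinator: scatter the data, then gather and combine the partial results."""
    outbox = queue.Queue()
    inboxes = [queue.Queue() for _ in range(n_workers)]
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], outbox))
        for rank in range(n_workers)
    ]
    for thread in threads:
        thread.start()
    for rank, inbox in enumerate(inboxes):
        inbox.put(data[rank::n_workers])   # scatter: every n-th element to each worker
    for thread in threads:
        thread.join()
    partials = dict(outbox.get() for _ in range(n_workers))  # gather
    return sum(partials.values()), partials


total, partials = run(list(range(10)))
print(total, partials)
```

Workers share no variables; all cooperation happens through explicit messages, which is the property that lets the same program structure span separate cluster nodes.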
Cluster Components…7b
Development Tools
Compilers
C/C++/Java
Parallel programming with C++ (MIT Press book)
RAD (rapid application development tools)
GUI based tools for PP modeling
Debuggers
Performance Analysis Tools
Visualization Tools
Cluster Components…8
Applications
Sequential
Parallel / Distributed (Cluster-aware app.)
Grand Challenge applications
Weather Forecasting
Quantum Chemistry
Molecular Biology Modeling
Engineering Analysis (CAD/CAM)
……………….
PDBs, web servers, data-mining
Classification of Cluster Computers
Clusters Classification..1
Based on Focus (in Market)
High Performance (HP) Clusters
Grand Challenging Applications
High Availability (HA) Clusters
Mission Critical applications
Clusters Classification..2
Based on Workstation/PC Ownership
Dedicated Clusters
Non-dedicated clusters
Adaptive parallel computing
Also called Communal
multiprocessing
Clusters Classification..3
Based on Node Architecture..
Clusters of PCs (CoPs)
Clusters of Workstations (COWs)
Clusters of SMPs (Symmetric
Multiprocessors)(CLUMPs)
Clusters Classification..4
Based on Node OS Type..
Linux Clusters (Beowulf)
Solaris Clusters (Berkeley NOW)
NT Clusters (HPVM)
AIX Clusters (IBM SP2)
SCO/Compaq Clusters (Unixware)
…….Digital VMS Clusters, HP
clusters, ………………..
Clusters Classification..5
Based on node components architecture &
configuration (Processor Arch, Node Type:
PC/Workstation.. & OS: Linux/NT..):
Homogeneous Clusters
All nodes will have similar configuration
Heterogeneous Clusters
Nodes based on different processors
and running different OSes.
Clusters Classification..6a
Dimensions of Scalability & Levels of Clustering
[Figure: three scalability axes labeled (1) Technology, (2) Platform, and (3) Network. Labels shown: Uniprocessor, SMP, Cluster, MPP; Workgroup, Department, Campus, Enterprise, Public; Metacomputing (GRID).]
Clusters Classification..6b
Levels of Clustering
Group Clusters (#nodes: 2-99)
(a set of dedicated/non-dedicated computers, mainly connected by a SAN such as Myrinet)
Departmental Clusters (#nodes: 99-999)
Organizational Clusters (#nodes: many 100s)
(using ATM networks)
Internet-wide Clusters = Global Clusters (#nodes: 1000s to many millions)
Metacomputing
Web-based Computing
Agent Based Computing
Java plays a major role in web and agent based computing
Major issues in cluster design
Size Scalability (physical & application)
Enhanced Availability (failure management)
Single System Image (look-and-feel of one system)
Fast Communication (networks & protocols)
Load Balancing (CPU, Net, Memory, Disk)
Security and Encryption (clusters of clusters)
Distributed Environment (social issues)
Programmability (simple API if required)
Manageability (admin. and control)
Applicability (cluster-aware and non-aware app.)
What Next ??
Clusters of Clusters (HyperClusters)
Global Grid
Interplanetary Grid
Universal Grid??
Clusters of Clusters (HyperClusters)
[Figure: Cluster 1, Cluster 2, and Cluster 3 connected over a LAN/WAN. Each cluster has a scheduler, a master daemon, execution daemons, clients, and a submit / graphical control interface.]
Towards Grid Computing….
What is Grid?
An infrastructure that couples
Computers (PCs, workstations, clusters, traditional supercomputers, and even laptops, notebooks, mobile computers, PDAs, and so on)
Software (e.g., renting expensive special purpose applications on demand)
Databases (e.g., transparent access to human genome database)
Special Instruments (e.g., radio telescope: SETI@Home searching for life in the galaxy, Astrophysics@Swinburne for pulsars)
People (maybe even animals, who knows?)
across local/wide-area networks (enterprise, organisations, or Internet) and presents them as a unified, integrated (single) resource.
Conceptual view of the Grid
Leading to Portal (Super)Computing
http://www.sun.com/hpc/
Grid Application-Drivers
Old and New applications getting enabled due to
coupling of computers, databases, instruments, people,
etc:
(distributed) Supercomputing
Collaborative engineering
high-throughput computing
large scale simulation & parameter studies
Remote software access / Renting Software
Data-intensive computing
On-demand computing
Grid Components
Grid Apps.: Applications and Portals
Scientific, Engineering, Collaboration, Prob. Solving Env., Web enabled Apps, …
Grid Tools: Development Environments and Tools
Languages, Libraries, Debuggers, Monitoring, Resource Brokers, Web tools, …
Grid Middleware: Distributed Resources Coupling Services
Comm., Sign on & Security, Information, Process, Data Access, QoS, …
Grid Fabric: Local Resource Managers
Operating Systems, Queuing Systems, Libraries & App Kernels, TCP/IP & UDP, …
Grid Fabric: Networked Resources across Organisations
Computers, Clusters, Storage Systems, Data Sources, Scientific Instruments, …
Many GRID Projects and Initiatives
USA: Globus, Legion, JAVELIN, AppLeS, NASA IPG, Condor, Harness, NetSolve, NCSA Workbench, WebFlow, EveryWhere, and many more...
Europe: UNICORE, MOL, METODIS, Globe, Poznan Metacomputing, CERN Data Grid, MetaMPI, DAS, JaWS, and many more...
Australia: Nimrod/G, EcoGrid and GRACE, DISCWorld
Japan: Ninf, Bricks, and many more...
Public forums: Computing Portals, Grid Forum, European Grid Forum, IEEE TFCC, GRID’2000 and more.
Public Grid Initiatives: Distributed.net, SETI@Home, Compute Power Grid
Literature on Cluster Computing
Reading Resources..1
Internet & WWW
Computer Architecture: http://www.cs.wisc.edu/~arch/www/
Linux Parallel Processing: http://yara.ecn.purdue.edu/~pplinux/Sites/
Solaris-MC: http://www.sunlabs.com/research/solaris-mc
Microprocessors: Recent Advances: http://www.microprocessor.sscc.ru
Beowulf: http://www.beowulf.org
Metacomputing: http://www.sis.port.ac.uk/~mab/Metacomputing/
Reading Resources..2
Books
In Search of Clusters, by G. Pfister, Prentice Hall (2nd ed.), 1998
High Performance Cluster Computing, Volume 1: Architectures and Systems; Volume 2: Programming and Applications, edited by Rajkumar Buyya, Prentice Hall, NJ, USA
Scalable Parallel Computing, by K. Hwang & Z. Xu, McGraw Hill, 1998
Cluster Computing Forum
IEEE Task Force on Cluster Computing (TFCC)
http://www.ieeetfcc.org
TFCC Activities...
Network Technologies
OS Technologies
Parallel I/O
Programming Environments
Java Technologies
Algorithms and Applications
Analysis and Profiling
Storage Technologies
High Throughput Computing
High Availability
Single System Image
Performance Evaluation
Software Engineering
Education
Newsletter
Industrial Wing
TFCC Regional Activities
All the above have their own pages, see pointers from: http://www.ieeetfcc.org
Mailing list, Workshops, Conferences, Tutorials, Web-resources etc.
Resources for introducing the subject at senior undergraduate and graduate levels.
Tutorials/Workshops at IEEE Chapters..
….. and so on.
FREE MEMBERSHIP, please join!
Visit TFCC Page for more details: http://www.ieeetfcc.org (updated daily!).
THANK YOU
Lembaga Ilmu Pengetahuan Indonesia
Pusat Penelitian Informatika
Tahun 1965 : Lembaga
Elektroteknika Nasional
(LEN)
Keppres No. 1 Tahun
1986 : Puslitbang
Telkoma, Inkom, Telimek
dan UPT Pusat LEN
Tahun 1990 :
UPT LEN diserahkan ke
BPIS (spin-off)
SK Ka. LIPI No.
1151/M/2001 : Pusat
Penelitian Informatika
Profil SDM
Fungsional Peneliti&Kandidat, Non Peneliti dan
fungsional Umum
Wan
ita
30%
Peneliti: 54 Orang
Pria
70%
27%
Fungsional Non
Peneliti: 3 Orang
59%
11%
RANGE USIA
Fungsional Umum adm:
24 Orang
3%
20
15
10
5
0
Series1
Penata Teknis: 10
Orang
26-30
31-35
36-40
41-45
46-50
51-55
56-60
61-65
11
15
19
19
11
8
5
3
Riset @P2Informatika
• Computational Science
Beberapa Hasil Penelitian
PEMANFAATAN TEKNOLOGI BERBASIS VISUAL UNTUK MENILAI KUALITAS PRODUK
Beberapa Hasil Penelitian
Text Data model for Weather Data
DNA QR codes of (a). JX426135; (B) JN245997; (c)
JN245994; (d) JN632605
Data Hiding
Scheme for
Digital Image
Simulasi Curah Hujan di Indonesia
Adaptasi model iklim wilayah Indonesia
menggunakan REGCM 4.0
Pemanfaatan RegCM4 (Regional Climate Model) untuk simulasi iklim
spesifik untuk wilayah Indonesia.
Perambatan Energi Gelombang
Bertujuan untuk melacak perambatan gelombang dengan potensi energi
yang cukup besar. Penelitian berfokus pada perambatan energi
gelombang permukaan air.
Density plot tinggi dan energi gelombang
Simulasi Dinamika Populasi Nyamuk dengan
Cellular
Model dan simulasi dinamika populasi nyamuk merupakan studi
bidang komputasi biologi untuk memahami perubahan ukuran
populasi suatu spesies. Hasil simulasi menunjukkan model yang
diusulkan mampu mensimulasikan populasi nyamuk secara temporal
dan spasial.
Simulasi Dispersal Nyamuk
Pengembangan Algoritma Sistem
Uji Berbasis Visual
Sistem Uji Berbagai Parameter Kualitas
2015
Pengujian Cip Sensor
Implementasi Pengujian pada Produksi Massal
CERN, Swiss
Produksi massal
Cip
terseleksi
2016
Cip hasil
produksi
Pengujian Visual
Riset @P2Informatika
• Big Data & IoT
Jenis Layanan
Layanan Komputasi untuk Publik
Layanan Diseminasi Teknologi Komputasi Berkinerja Tinggi
HPC LIPI @P2Informatika
Fasilitas
Cibinong
Gedung Pusat Inovasi
Jl. Raya Jakarta-Bogor
KM 47
Cibinong, Jawa Barat
Bandung
Gedung 10 Kompleks
LIPI
Jl. Cisitu No. 21
Bandung, Jawa Barat
Fasilitas HPC
Master Node (4 Node)
Prosesor: 2 x 8 core
Intel Xeon E5 Family
Memori: 128 GB
Storage: 24 TB
GPU Node (20 Node)
- Prosesor: 2 x 4 core
Intel Xeon E5 Family
- Memori: 8 - 16 GB
- Storage: 500 GB
- GPU Tesla M2075
(488 core)
Basic Node (114 Node)
Prosesor: 2 x 4 core
Intel Xeon E5 Family
Memori: 8 - 16 GB
Storage: 500 GB
High Memory Node (8
Node)
- Prosesor: 2 x 8 core
Intel Xeon E5 Family
- Memori: 256 GB
- Storage: 2 x 300 GB
HPC LIPI @P2Informatika
• Cibinong
− 928 Core
− 3072 GB RAM
− 103 TB Space
−
• Bandung
− 336 Core
− 560 GB RAM
− 67 TB Space
HPC Cibinong
HPC LIPI @P2Informatika
Apa itu Komputasi paralel?
Komputasi Serial:
Komputer desktop
konvensional memiliki Central
Processing Unit tunggal (CPU)
dan komputasi dilakukan
dengan memecah problem
menjadi serangkaian perintah
diskrit.
Perintah di eksekusi oleh
komputer satu persatu, karena
hanya satu perintah yang
dapat dijalankan dalam satu
waktu.
Apa itu Komputasi paralel?
Komputasi Paralel:
Sedangkan Hardware High
Performance Computing terdiri dari
beberapa CPU dan dikonfigurasi untuk
menjalankan perhitungan paralel.
Setiap problem harus dipecah menjadi
bagian-bagian diskrit yang dapat
dikomputasi secara konkuren
Setiap bagian kemudian dipecah
menjadi serangkaian perintah.
Perintah tersebut dikomputasi secara
simultan di CPU yang berbeda.
Apa itu Komputasi paralel?
•Masalah komputasi yang akan dijalankan di HPC harus
dapat:
•Dipisah-pisahkan menjadi potongan-potongan diskrit
pekerjaan yang dapat diselesaikan secara bersamaan;
•Mengeksekusi beberapa instruksi program pada setiap saat
dalam waktu;
•Diselesaikan dalam waktu kurang dengan beberapa sumber
komputasi daripada dengan sumber daya komputasi tunggal.
•sumber daya komputasi biasanya:
•komputer dengan prosesor / core banyak
•Sejumlah komputer yang terhubung dengan jaringan
Apa itu Komputasi paralel?
Jika anda memiliki aplikasi komputasi favorit
Satu prosesor akan memberi hasil dalam N jam.
Mengapa tidak menggunakan N prosesor
-- dan mendapat hasil hanya dalam 1 jam?
Konsepnya :
Parallelism = menggunakan beberapa prosesor pada sebuah problem
► Dua komponen parallel programming
Komputasi
Komunikasi
A Computer Cluster
Regular PC
A computer cluster
Front-end node
1 CPU
1 or 2 Hard disks
Some memory
512MB,.. 1GB,..
Compute-0-0
Compute-0-1
Compute-0-2
Parallel computing is computing by committee
Parallel computing: the use of multiple computers or processors working together on a common task.
Each processor works on its own part of the problem.
Processors are allowed to exchange information with other processors.
[Figure: the grid of the problem to be solved, plotted on x and y axes and divided into four areas. CPU #1, CPU #2, CPU #3 and CPU #4 each work on one area of the problem and exchange data with the CPUs working on adjacent areas.]
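The four-area split can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: it uses one smoothing step (each cell becomes the average of itself and its direct neighbours) as a stand-in for the real computation, and it uses Python threads that share memory, so the "exchange" is simply each worker reading the neighbouring cells just outside its own area. On a real cluster that exchange would be explicit message passing. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def smooth_cell(grid, i, j):
    """Average of a cell and its existing up/down/left/right neighbours."""
    rows, cols = len(grid), len(grid[0])
    total, count = grid[i][j], 1
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < rows and 0 <= nj < cols:
            total += grid[ni][nj]
            count += 1
    return total / count


def smooth_serial(grid):
    """Reference result: one worker updates every cell."""
    return [[smooth_cell(grid, i, j) for j in range(len(grid[0]))]
            for i in range(len(grid))]


def smooth_area(grid, row_range, col_range):
    """Work of one CPU: update only the cells inside its own area.
    Reading cells just outside the area stands in for the exchange step."""
    return {(i, j): smooth_cell(grid, i, j)
            for i in row_range for j in col_range}


def smooth_parallel(grid):
    """Split the grid into four areas and update them with four workers."""
    rows, cols = len(grid), len(grid[0])
    mid_r, mid_c = rows // 2, cols // 2
    areas = [(range(0, mid_r), range(0, mid_c)),
             (range(0, mid_r), range(mid_c, cols)),
             (range(mid_r, rows), range(0, mid_c)),
             (range(mid_r, rows), range(mid_c, cols))]
    result = [[0.0] * cols for _ in range(rows)]
    with ThreadPoolExecutor(max_workers=4) as pool:
        for partial in pool.map(lambda area: smooth_area(grid, *area), areas):
            for (i, j), value in partial.items():
                result[i][j] = value
    return result


if __name__ == "__main__":
    grid = [[float(i * 6 + j) for j in range(6)] for i in range(6)]
    assert smooth_parallel(grid) == smooth_serial(grid)
    print("parallel and serial results agree")
```

The sketch shows the decomposition only. In standard CPython, threads do not speed up CPU-bound work like this, so a real implementation would use separate processes or nodes.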
Why use HPC?
Data + Simulation = Innovation
“Calculation will increasingly
replace experimentation in design
of useful materials, catalysts, and
drugs, leading to much greater
efficiency and new opportunities
for creativity”
-- Frank Wilczek, Physics in 100 Years
The real world is massively parallel:
In the real world, many complex, interrelated events happen at the same time, yet within a temporal sequence.
Compared with serial computing, parallel computing is much better suited to modeling, simulating and understanding complex real-world phenomena.
For example, imagine modeling the following serially:
[Figures: examples of real-world phenomena]
Save time and/or money
• In theory, throwing more resources at a task shortens its time to completion, with potential cost savings.
• Parallel computers can be built from cheap commodity components.
SOLVE LARGER / MORE COMPLEX PROBLEMS:
Many problems are so large and/or complex that it is impractical or impossible to solve them on a single computer, especially given limited computer memory.
Examples:
Web search engines and databases processing millions of transactions every second
"Grand Challenge problems" (en.wikipedia.org/wiki/Grand_Challenge) requiring petaflops and petabytes of computing resources.
Grand Challenge Problems
Solving grand challenge applications using computer
modeling, simulation and analysis
Life Sciences
CAD/CAM
Aerospace
Digital Biology
E-commerce/anything
Military Applications
Grand Challenge Problems
Life Science
Engine Combustion Research Group
Signal Processing/Quantum Mechanics
Convolution model (stencil)
Matrix computations (eigenvalues…)
Conjugate gradient methods
Normally not very demanding on latency and bandwidth
Some algorithms are embarrassingly parallel
Examples: seismic migration/processing, medical imaging, SETI@Home
Signal Processing Example
Do Work Concurrently
A single compute resource can only do one thing at a time.
Multiple compute resources can do many things simultaneously.
Example: collaborative networks provide a global venue where people from around the world can meet and conduct work "virtually".
Who Uses Parallel Computing?
Science and Engineering:
Historically, parallel computing has been considered "the high end of computing", and has been used to model difficult problems in many areas of science and engineering:
Atmosphere, Earth, Environment
Physics - applied, nuclear, particle, condensed matter, high pressure, fusion, photonics
Bioscience, Biotechnology, Genetics
Chemistry, Molecular Sciences
Geology, Seismology
Mechanical Engineering - from prosthetics to spacecraft
Electrical Engineering, Circuit Design, Microelectronics
Computer Science, Mathematics
Defense, Weapons
HPC Applications and Major Industries
Finite Element Modeling: Auto/Aero
Fluid Dynamics: Auto/Aero, Consumer Packaged Goods Mfgs, Process Mfg, Disaster Preparedness (tsunami)
Imaging: Seismic & Medical
Finance: Banks, Brokerage Houses (Regression Analysis, Risk, Options Pricing, What if, …)
Molecular Modeling: Biotech and Pharmaceuticals
Complex Problems, Large Datasets, Long Runs
This slide is from Intel presentation “Technologies for Delivering Peak Performance on HPC and Grid Applications”
5 January 2017
Divide and Conquer
Say 1 CPU
1,000,000 elements
Numerical processing for 1 element = 0.1 secs
One computer will take 100,000 secs, about 27.8 hrs
Say 100 CPUs
About 0.28 hr, roughly 17 mins, assuming the work divides evenly
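The arithmetic on this slide can be checked with a short calculation. The sketch assumes ideal scaling (the elements divide evenly and nothing is lost to communication); the function name is hypothetical.

```python
def run_time_seconds(elements: int, seconds_per_element: float,
                     cpus: int = 1) -> float:
    """Ideal run time when the elements are divided evenly among the CPUs
    and no time is lost to communication (an assumption)."""
    return elements * seconds_per_element / cpus


one_cpu = run_time_seconds(1_000_000, 0.1)
hundred_cpus = run_time_seconds(1_000_000, 0.1, cpus=100)

print(f"1 CPU:    {one_cpu:,.0f} s = {one_cpu / 3600:.1f} hours")
print(f"100 CPUs: {hundred_cpus:,.0f} s = {hundred_cpus / 60:.1f} minutes")
```

Under these assumptions the run drops from about 27.8 hours on one CPU to about 16.7 minutes on 100 CPUs.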
Life Science Problem – an example of Protein Folding
A molecular dynamics simulation for a protein folding problem takes a year of computing in serial mode
•Excerpted from The future of HPC by David Klepacki, IBM
•Petaflop = a thousand trillion floating point operations per second
Disaster Preparedness
Project LEAD
Severe weather (tornado) prediction, led by OU.
HPC and dynamic adaptation to weather forecasts
Professor Seidel’s LSU CCT
Hurricane Route Prediction
Emergency Preparedness
Show Movie – HPC-enabled
Simulation
Cancer Gene-mining
Unsuccessful on a uni-processor
Approach
Novel parallel gene-mining algorithms
Input from microarray
Retain accuracy
Significantly speed up (superlinear)
IBM P5 supercomputer (128 node PPC).
Time to run the algorithm, keeping number of nodes fixed
[Chart: time taken (in secs) against number of processors, with series for OvaMarker based Selection and GeneSetMine based Selection. Cancer types labelled: Mesothelioma, Breast, Renal, Leukemia, Prostate, Lung, Pancreas, Bladder, Colorectal, Ovary, Lymphoma, Melanoma.]
Did you know that the PlayStation 3 is an HPC system/supercomputer?
9 cores/CPUs in one chip.
Future gaming software is no longer about graphics or multimedia only
This diagram is from an IBM article on the Cell processor & compiler challenge
Global Climate Modeling Problem
Problem is to compute:
f(latitude, longitude, elevation, time) → temperature, pressure, humidity, wind velocity
Approach:
Discretize the domain, e.g., a
measurement point every 10 km
Devise an algorithm to predict
weather at time t+1 given t
• Uses:
- Predict major events, e.g., El Nino
- Use in setting air emissions standards
C Cox
Global Climate Modeling Computation
Computational requirements:
To match real-time, need 5 x 10^11 flops in 60 seconds = 8 Gflop/s
Weather prediction (7 days in 24 hours): 56 Gflop/s
Climate prediction (50 years in 30 days): 4.8 Tflop/s
To use in policy negotiations (50 years in 12 hours): 288 Tflop/s
To double the grid resolution, computation is at least 8x
State of the art models require integration of atmosphere, ocean, sea-ice, land models, plus
possibly carbon cycle, geochemistry and more
Current models are coarser than this
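The rates quoted above follow from scaling the real-time rate by how much faster than real time the simulation must run. The sketch below reproduces them under two assumptions that are not stated on the slide: the real-time rate is rounded down to 8 Gflop/s before scaling, and a year is counted as 360 days. The names are hypothetical.

```python
# Unrounded real-time requirement: 5 x 10^11 flops every 60 seconds.
exact_real_time = 5e11 / 60          # about 8.3e9 flop/s

# Assumption: the slide rounds this down to 8 Gflop/s before scaling.
ROUNDED_REAL_TIME = 8e9

# Assumption: a 360-day year reproduces the slide's figures.
DAYS_PER_YEAR = 360


def required_rate(simulated_days: float, wall_clock_days: float,
                  real_time_rate: float = ROUNDED_REAL_TIME) -> float:
    """Flop/s needed to simulate `simulated_days` in `wall_clock_days`."""
    return real_time_rate * simulated_days / wall_clock_days


weather = required_rate(7, 1)                       # 7 days in 24 hours
climate = required_rate(50 * DAYS_PER_YEAR, 30)     # 50 years in 30 days
policy = required_rate(50 * DAYS_PER_YEAR, 0.5)     # 50 years in 12 hours

print(f"real time: {exact_real_time / 1e9:.1f} Gflop/s (unrounded)")
print(f"weather:   {weather / 1e9:.0f} Gflop/s")
print(f"climate:   {climate / 1e12:.1f} Tflop/s")
print(f"policy:    {policy / 1e12:.0f} Tflop/s")

# One way to read "at least 8x": doubling the resolution along three
# refined dimensions multiplies the work by 2**3 (an assumption).
resolution_cost_factor = 2 ** 3
```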
Heart Simulation
Problem is to compute blood flow in the heart
Approach:
Modeled as an elastic structure in an incompressible fluid.
The “immersed boundary method” due to Peskin and McQueen.
20 years of development in model
Many applications other than the heart: blood clotting, inner
ear, paper making, embryo growth, and others
Use a regularly spaced mesh (set of points) for evaluating the fluid flow
Uses
Current model can be used to design artificial heart valves
Can help in understanding the effects of disease (leaky valves)
Related projects look at the behavior of the heart during a heart attack
Ultimately: real-time clinical work
Parallel computing: Web searching
Functional parallelism: crawling, indexing, sorting
Parallelism between queries: multiple users
Finding information amidst junk
Preprocessing of the web data set to help find information
• General themes of sifting through large, unstructured data
sets:
Parallel Programming: Decomposition Techniques
Functional Decomposition (Functional Parallelism)
Decomposing the problem into different tasks which can be distributed
to multiple processors for simultaneous execution
Good to use when there is no static structure or fixed determination of the number of calculations to be performed
Domain Decomposition (Data Parallelism)
Partitioning the problem's data domain and distributing portions to
multiple processors for simultaneous execution
Good to use for problems where:
data is static (e.g. solving large matrix or finite difference or finite element calculations)
dynamic data structure tied to single entity where entity can be separated
domain is fixed but computation within various regions of the domain is dynamic (fluid vortices models)
Combination of functional and domain decomposition
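The two decomposition styles can be contrasted with a small sketch. It is illustrative only: it uses Python's `concurrent.futures` thread pool as a stand-in for multiple processors, and the function names and the choice of tasks (sum, minimum, maximum) are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def domain_decomposition(data, workers=4):
    """Data parallelism: every worker runs the same operation (a sum)
    on its own slice of the data; the partial results are then combined."""
    chunk = max(1, (len(data) + workers - 1) // workers)
    slices = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum, slices))
    return sum(partial_sums)


def functional_decomposition(data):
    """Functional parallelism: different tasks run at the same time,
    each on the whole input."""
    tasks = {"minimum": min, "maximum": max, "total": sum}
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {name: pool.submit(task, data)
                   for name, task in tasks.items()}
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    values = list(range(1, 101))
    print(domain_decomposition(values))
    print(functional_decomposition(values))
```

In the first function the data is partitioned and the task is fixed; in the second the data is fixed and the tasks differ. A combined approach would partition the data inside each task.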
[Figures: Bioscience, Biotechnology, Genetics; Atmosphere, Earth, Environment]
Who Uses Parallel Computing?
Industrial and Commercial:
The following applications require processing large amounts of data in sophisticated ways.
Big Data, databases, data mining
Oil exploration
Web search engines, web-based business services
Medical imaging and diagnosis
Pharmaceutical design
Financial and economic modeling
Management of national and multinational corporations
Advanced graphics and virtual reality, particularly in the entertainment industry
Networked video and multimedia technologies
Collaborative work environments
Top Ten Most Powerful Computers (http://www.top500.org)
# | Site | System | Cores | Rmax (TFlop/s) | Rpeak (TFlop/s) | Power (kW)
1 | National Supercomputing Center in Wuxi, China | Sunway TaihuLight - Sunway MPP, Sunway SW26010 260C 1.45GHz, Sunway, NRCPC | 10,649,600 | 93,014.6 | 125,435.9 | 15,371
2 | National Super Computer Center in Guangzhou, China | Tianhe-2 (MilkyWay-2) - TH-IVB-FEP Cluster, Intel Xeon E5-2692 12C 2.200GHz, TH Express-2, Intel Xeon Phi 31S1P, NUDT | 3,120,000 | 33,862.7 | 54,902.4 | 17,808
3 | DOE/SC/Oak Ridge National Laboratory, US | Titan - Cray XK7, Opteron 6274 16C 2.200GHz, Cray Gemini interconnect, NVIDIA K20x, Cray Inc. | 560,640 | 17,590.0 | 27,112.5 | 8,209
4 | DOE/NNSA/LLNL, US | Sequoia - BlueGene/Q, Power BQC 16C 1.60 GHz, Custom, IBM | 1,572,864 | 17,173.2 | 20,132.7 | 7,890
5 | DOE/SC/LBNL/NERSC, US | Cori - Cray XC40, Intel Xeon Phi 7250 68C 1.4GHz, Aries interconnect, Cray Inc. | 622,336 | 14,014.7 | 27,880.7 | 3,939
6 | Joint Center for Advanced HPC, Japan | Oakforest-PACS - PRIMERGY CX1640 M1, Intel Xeon Phi 7250 68C 1.4GHz, Intel Omni-Path, Fujitsu | 556,104 | 13,554.6 | 24,913.5 | 2,719
7 | RIKEN (AICS), Japan | K computer, SPARC64 VIIIfx 2.0GHz, Tofu interconnect, Fujitsu | 705,024 | 10,510.0 | 11,280.4 | 12,660
8 | Swiss National Supercomputing Centre (CSCS), Switzerland | Piz Daint - Cray XC50, Xeon E5-2690v3 12C 2.6GHz, Aries interconnect, NVIDIA Tesla P100, Cray Inc. | 206,720 | 9,779.0 | 15,988.0 | 1,312
9 | DOE/SC/Argonne National Laboratory, US | Mira - BlueGene/Q, Power BQC 16C 1.60GHz, Custom, IBM | 786,432 | 8,586.6 | 10,066.3 | 3,945
10 | DOE/NNSA/LANL/SNL, US | Trinity - Cray XC40, Xeon E5-2698v3 16C 2.3GHz, Aries interconnect, Cray Inc. | 301,056 | 8,100.9 | 11,078.9 | 4,233
Computer Food Chain
Original Food Chain Picture
1984 Computer Food Chain
Mainframe
Mini Computer
Vector Supercomputer
Workstation
PC
1994 Computer Food Chain
(hitting wall soon)
Mini Computer
Workstation
(future is bleak)
Mainframe
Vector Supercomputer
MPP
PC
Computer Food Chain (Now and Future)
Clustering of Computers for Collective Computing: Trends
[Figure: timeline marked 1960, 1990, 1995+, 2000 and "?"]
Computing Platforms Evolution: Breaking Administrative Barriers
[Figure: performance plotted against administrative barriers (Individual, Group, Department, Campus, State, National, Globe, Inter Planet, Universe). Platforms shown: Desktop (Single Processor?), SMPs or SuperComputers, Local Cluster, Enterprise Cluster/Grid, Global Cluster/Grid, Inter Planet Cluster/Grid ??]
Cluster Computer and its
Components
Clustering gained momentum when 3 technologies converged:
1. Very high-performance microprocessors
workstation performance = yesterday's supercomputers
2. High-speed communication
Communication between cluster nodes >= communication between processors in an SMP.
3. Standard tools for parallel/distributed computing and their growing popularity.
Parallel architectures (1)
Vector machines
CPU processes multiple data sets
shared memory
advantages: performance, ease of programming
issues: scalability, price
examples: Cray SV, NEC SX, Athlon 3DNow!, Pentium IV SSE/SSE2
Massively parallel processors (MPP)
large number of CPUs
distributed memory
advantages: scalability, price
issues: performance, programming difficulties
examples: Connection Machine CM-1 and CM-2, GAPP (Geometric Array Parallel Processor)
Parallel architectures (2)
Symmetric Multiprocessing (SMP)
two or more processors
shared memory
advantages: price, performance, ease of programming
issues: scalability
examples: UltraSparcII, Alpha ES, Generic Itanium, Opteron, Xeon, …
Non Uniform Memory Access (NUMA)
Solves SMP's scalability issue
hybrid memory model
advantages: scalability
issues: price, performance, programming difficulties
examples: SGI Origin/Altix, Alpha GS, HP Superdome
Clusters
Cluster consists of:
Nodes
Network
OS
Cluster middleware
Standard components
Avoiding expensive proprietary components
Cluster Architecture
[Figure: layered cluster architecture. Top: Sequential Applications and Parallel Applications, the latter running on a Parallel Programming Environment. Middle: Cluster Middleware (Single System Image and Availability Infrastructure). Bottom: PC/Workstation nodes, each with Communications Software and Network Interface Hardware, connected through the Cluster Interconnection Network/Switch.]
Cluster Components...1a
Nodes
Multiple High Performance Components:
PCs
Workstations
SMPs (CLUMPS)
Distributed HPC Systems leading to
Metacomputing
They can be based on different architectures and run different OSes
Cluster Components...1b
Processors
There are many (CISC/RISC/VLIW/Vector..)
Intel: Pentiums, Xeon, Merced…
Sun: SPARC, UltraSPARC
HP PA
IBM RS6000/PowerPC
SGI MIPS
Digital Alphas
Integrate memory, processing and networking into a single chip
IRAM (CPU & Mem): (http://iram.cs.berkeley.edu)
Alpha 21364 (CPU, Memory Controller, NI)
Cluster Components…2
OS
State of the art OS:
Linux (Beowulf)
Microsoft NT (Illinois HPVM)
SUN Solaris (Berkeley NOW)
IBM AIX (IBM SP2)
HP UX (Illinois - PANDA)
Mach (Microkernel based OS) (CMU)
Cluster Operating Systems (Solaris MC, SCO Unixware, MOSIX (academic project))
OS gluing layers (Berkeley Glunix)
Cluster Components…3
High Performance Networks
Ethernet (10Mbps),
Fast Ethernet (100Mbps),
Gigabit Ethernet (1Gbps)
SCI (Dolphin - MPI - 12 microsecond latency)
ATM
Myrinet (1.2Gbps)
Digital Memory Channel
FDDI
Cluster Components…4
Network Interfaces
Network Interface Card
Myrinet has NIC
User-level access support
Alpha 21364 processor integrates
processing, memory controller, network
interface into a single chip..
Cluster Components…5
Communication Software
Traditional OS supported facilities (heavy weight due to protocol processing):
Sockets (TCP/IP), Pipes, etc.
Light weight protocols (User Level)
Active Messages (Berkeley)
Fast Messages (Illinois)
U-net (Cornell)
XTP (Virginia)
Systems can be built on top of the above protocols
Cluster Components…6a
Cluster Middleware
Resides between the OS and applications and offers an infrastructure for supporting:
Single System Image (SSI)
System Availability (SA)
SSI makes the collection appear as a single machine (globalised view of system resources), e.g. telnet cluster.myinstitute.edu
SA - checkpointing and process migration
Cluster Components…6b
Middleware Components
Hardware
DEC Memory Channel, DSM (Alewife, DASH) SMP Techniques
OS / Gluing Layers
(Solaris MC, Unixware, Glunix)
Applications and Subsystems
System management and electronic forms
Runtime systems (software DSM, PFS etc.)
Resource management and scheduling (RMS):
CODINE, LSF, PBS, NQS, etc.
Cluster Components…7a
Programming environments
Threads (PCs, SMPs, NOW..)
POSIX Threads
Java Threads
MPI (Linux, NT, and many supercomputers)
PVM
Software DSMs (Shmem)
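As a rough illustration of the thread-based, shared-memory style listed above, the sketch below uses Python's standard `threading` module rather than POSIX or Java threads. Each thread counts the even numbers in its own share of a list and adds its result to a shared total under a lock. The function name and the task are hypothetical; message-passing environments such as MPI and PVM follow a different model and are not shown.

```python
import threading


def count_even_in_parallel(items, n_threads=4):
    """Count even numbers using several threads that share one total."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        # Each thread first works on private data...
        local_count = sum(1 for x in chunk if x % 2 == 0)
        # ...then updates the shared variable inside a critical section.
        with lock:
            total += local_count

    # Thread i takes every n_threads-th item, starting at position i.
    threads = [threading.Thread(target=worker, args=(items[i::n_threads],))
               for i in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return total


if __name__ == "__main__":
    print(count_even_in_parallel(list(range(100))))
```

The lock is what makes the shared update safe. In standard CPython this pattern illustrates the programming model rather than delivering a speedup for CPU-bound work.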
Cluster Components…7b
Development Tools
Compilers
C/C++/Java
Parallel programming with C++ (MIT Press book)
RAD (rapid application development tools)..
GUI based tools for PP modeling
Debuggers
Performance Analysis Tools
Visualization Tools
Cluster Components…8
Applications
Sequential
Parallel / Distributed (Cluster-aware app.)
Grand Challenge applications
Weather Forecasting
Quantum Chemistry
Molecular Biology Modeling
Engineering Analysis (CAD/CAM)
…
PDBs, web servers, data mining
Classification of Cluster Computers
Clusters Classification..1
Based on Focus (in Market)
High Performance (HP) Clusters
Grand Challenge Applications
High Availability (HA) Clusters
Mission Critical applications
Clusters Classification..2
Based on Workstation/PC Ownership
Dedicated Clusters
Non-dedicated clusters
Adaptive parallel computing
Also called communal multiprocessing
Clusters Classification..3
Based on Node Architecture..
Clusters of PCs (CoPs)
Clusters of Workstations (COWs)
Clusters of SMPs (Symmetric Multiprocessors) (CLUMPs)
Clusters Classification..4
Based on Node OS Type..
Linux Clusters (Beowulf)
Solaris Clusters (Berkeley NOW)
NT Clusters (HPVM)
AIX Clusters (IBM SP2)
SCO/Compaq Clusters (Unixware)
Digital VMS Clusters, HP clusters, …
Clusters Classification..5
Based on node components architecture &
configuration (Processor Arch, Node Type:
PC/Workstation.. & OS: Linux/NT..):
Homogeneous Clusters
All nodes will have similar configuration
Heterogeneous Clusters
Nodes based on different processors
and running different OSes.
Clusters Classification..6a
Dimensions of Scalability & Levels of Clustering
[Figure: three numbered axes, labelled Network, Technology and Platform. Levels shown: Workgroup, Department, Campus, Enterprise, Public, Metacomputing (GRID). Platforms shown: Uniprocessor, SMP, Cluster, MPP.]
Clusters Classification..6b
Levels of Clustering
Group Clusters (#nodes: 2-99)
(a set of dedicated/non-dedicated computers, mainly connected by a SAN like Myrinet)
Departmental Clusters (#nodes: 99-999)
Organizational Clusters (#nodes: many 100s)
(using ATM networks)
Internet-wide Clusters = Global Clusters (#nodes: 1000s to many millions)
Metacomputing
Web-based Computing
Agent Based Computing
Java plays a major role in web and agent based computing
Major issues in cluster design
Size Scalability (physical & application)
Enhanced Availability (failure management)
Single System Image (look and feel of one system)
Fast Communication (networks & protocols)
Load Balancing (CPU, Net, Memory, Disk)
Security and Encryption (clusters of clusters)
Distributed Environment (social issues)
Programmability (simple API if required)
Manageability (admin. and control)
Applicability (cluster-aware and non-aware app.)
What Next ??
Clusters of Clusters (HyperClusters)
Global Grid
Interplanetary Grid
Universal Grid??
Clusters of Clusters (HyperClusters)
[Figure: Cluster 1, Cluster 2 and Cluster 3 connected over a LAN/WAN. Each cluster has a Scheduler, a Master Daemon, an Execution Daemon, Clients, and Submit and Graphical Control interfaces.]
Towards Grid Computing….
What is Grid ?
An infrastructure that couples
Computers (PCs, workstations, clusters, traditional supercomputers, and even laptops, notebooks, mobile computers, PDAs, and so on)
Software (e.g., renting expensive special purpose applications on demand)
Databases (e.g., transparent access to the human genome database)
Special Instruments (e.g., radio telescopes: SETI@Home searching for life in the galaxy, Astrophysics@Swinburne for pulsars)
People (maybe even animals, who knows?)
across local/wide-area networks (enterprise, organisations, or the Internet) and presents them as a unified, integrated (single) resource.
Conceptual view of the Grid
Leading to Portal (Super)Computing
http://www.sun.com/hpc/
Grid Application-Drivers
Old and New applications getting enabled due to
coupling of computers, databases, instruments, people,
etc:
(distributed) Supercomputing
Collaborative engineering
high-throughput computing
large scale simulation & parameter studies
Remote software access / Renting Software
Data-intensive computing
On-demand computing
Grid Components
Grid Apps: Applications and Portals
Scientific, Engineering, Collaboration, Prob. Solving Env., Web enabled Apps, …
Grid Tools: Development Environments and Tools
Languages, Libraries, Debuggers, Monitoring, Resource Brokers, Web tools, …
Grid Middleware: Distributed Resources Coupling Services
Comm., Sign on & Security, Information, Process, Data Access, QoS, …
Grid Fabric: Local Resource Managers
Operating Systems, Queuing Systems, Libraries & App Kernels, TCP/IP & UDP, …
Grid Fabric: Networked Resources across Organisations
Computers, Clusters, Storage Systems, Data Sources, Scientific Instruments, …
Many GRID Projects and Initiatives
USA: Globus, Legion, JAVELIN, AppLeS, NASA IPG, Condor, Harness, NetSolve, NCSA Workbench, WebFlow, EveryWhere, and many more...
Europe: UNICORE, MOL, METODIS, Globe, Poznan Metacomputing, CERN Data Grid, MetaMPI, DAS, JaWS, and many more...
Australia: Nimrod/G, EcoGrid and GRACE, DISCWorld
Japan: Ninf, Bricks, and many more...
Public Forums: Computing Portals, Grid Forum, European Grid Forum, IEEE TFCC, GRID'2000 and more.
Public Grid Initiatives: Distributed.net, SETI@Home, Compute Power Grid
Literature on Cluster
Computing
Reading Resources..1
Internet & WWW
Computer Architecture:
http://www.cs.wisc.edu/~arch/www/
Linux Parallel Processing
http://yara.ecn.purdue.edu/~pplinux/Sites/
Solaris-MC
http://www.sunlabs.com/research/solaris-mc
Microprocessors: Recent Advances
http://www.microprocessor.sscc.ru
Beowulf:
http://www.beowulf.org
Metacomputing
http://www.sis.port.ac.uk/~mab/Metacomputing/
Reading Resources..2
Books
In Search of Clusters
by G. Pfister, Prentice Hall (2nd ed.), 1998
High Performance Cluster Computing
Volume 1: Architectures and Systems
Volume 2: Programming and Applications
Edited by Rajkumar Buyya, Prentice Hall, NJ, USA.
Scalable Parallel Computing
by K. Hwang & Z. Xu, McGraw Hill, 1998
Cluster Computing Forum
IEEE Task Force on Cluster Computing
(TFCC)
http://www.ieeetfcc.org
TFCC Activities...
Network Technologies
OS Technologies
Parallel I/O
Programming Environments
Java Technologies
Algorithms and Applications
Analysis and Profiling
Storage Technologies
High Throughput Computing
High Availability
Single System Image
Performance Evaluation
Software Engineering
Education
Newsletter
Industrial Wing
TFCC Regional Activities
All the above have their own pages, see pointers from:
http://www.ieeetfcc.org
Mailing list, Workshops, Conferences, Tutorials, Web-resources etc.
Resources for introducing subject in senior undergraduate and
graduate levels.
Tutorials/Workshops at IEEE Chapters..
….. and so on.
FREE MEMBERSHIP, please join!
Visit TFCC Page for more details:
http://www.ieeetfcc.org (updated daily!).
THANK YOU