1.Introduction
January 6, 2025
Preeti Malakar
[email protected]
Logistics
• Class hours: MW 3:30 – 5:00 PM (L16)
• Office hour: W 5:00 – 6:00 PM (KD 221)
• https://www.cse.iitk.ac.in/users/cs633/2024-25-2
– Lectures will be uploaded after every class
• Extra class/quiz/doubts: Saturday 11 AM – 12 PM
• Announcements/uploads on
– MooKIT
– Course email alias
• Emails to the instructor must have a subject line prefixed with [CS633]
Switch OFF All Devices
Grading Policy
• 75% attendance is compulsory for this course
• Participate actively in class
Lectures
• Lecture slides are pointers for the topic
– They won’t be as verbose as a book!
• In case you miss a class, please ensure you are
up to date with the lecture content
– Either ask your friend
– Or, ask the instructor (Saturday class)
Assignment
Plagiarism
Lecture 1
Introduction
Multicore Era
CPU: Intel 4004 (1971), single core, single chip
Moore’s Law (1965)
Number of transistors on a chip doubles roughly every two years (often quoted as every 18 months)
[Source: Wikipedia]
“However, it must be programmed with a more complicated parallel programming
model to obtain maximum performance.”
Trends
~ $600 million
~ 7300 sq. ft.
~ 22 MW power
~ 23000 L water
Top #1 Supercomputer
https://www.top500.org/resources/top-systems/
green500.org (Nov’23)
Making of a Supercomputer
Source: energy.gov
Greenest Data Centre?
“The 149,000 square foot facility built on a hillside overlooking the UC Berkeley
campus and San Francisco Bay will house one of the most energy-efficient computing
centers anywhere, tapping into the region’s mild climate to cool the supercomputers
at the National Energy Research Scientific Computing Center (NERSC) and eliminating
the need for mechanical cooling.”
https://www.science.org/content/article/climate-change-threatens-supercomputers
Top Supercomputers from India (Nov’23)
2024…
Supercomputing in India [topsc.cdacb.in, Jul’24]
Source: www.iitk.ac.in
Credit: Ashish Kuvelkar, CDAC
National Supercomputing Mission Sites
Big Compute
Massively Parallel Codes
Numerical Weather Models
Massively Parallel Simulations
Computational Science
Output Data
• High-energy physics: Higgs boson simulation, 10 PB / year (Source: CERN)
• Cosmology: Q Continuum simulation, 2 PB / simulation, scaled to 786K cores on Mira (Source: Salman Habib et al.)
• Climate/weather: Hurricane simulation, 240 TB / simulation (Source: NASA)
Input Data
[Figure: Byte/FLOP ratio by year, 1997 to 2018; vertical axis ticks at 1.00E-04, 1.00E-05, 1.00E-06]
Why Parallel?
A*
20 hours
2 hours
Not really
Parallelism
A parallel computer is a collection of processing
elements that communicate and cooperate to solve
large problems fast.
Speedup
Example – Sum of squares of N numbers
• Serial: O(N)
• Parallel: O(N/P) + communication time
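The sum-of-squares example can be sketched as a block decomposition. This is only an illustration, not the course's implementation: the helper names (`chunk_bounds`, `partial_sum_of_squares`, `sum_of_squares_decomposed`) are hypothetical, and the P "processes" are emulated one after another in a single Python process, so it shows how the work splits into O(N/P) pieces but does not measure any real speedup.

```python
def chunk_bounds(n, p):
    """Split range(n) into p contiguous blocks whose sizes differ by at most 1."""
    base, extra = divmod(n, p)
    bounds = []
    start = 0
    for rank in range(p):
        size = base + (1 if rank < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def partial_sum_of_squares(values, start, stop):
    """Work of one hypothetical process: about N/P multiply-adds."""
    return sum(x * x for x in values[start:stop])


def sum_of_squares_decomposed(values, p):
    """Emulate P processes: each computes a partial sum, then results are combined."""
    partials = [
        partial_sum_of_squares(values, start, stop)
        for start, stop in chunk_bounds(len(values), p)
    ]
    # In a real parallel run, combining the partial sums is the step
    # that pays the communication time.
    return sum(partials)


data = list(range(1, 11))
serial = sum(x * x for x in data)
decomposed = sum_of_squares_decomposed(data, 4)
print(serial, decomposed)
```

Both values printed are the same, since the decomposition only regroups the additions.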
Performance Measure
• Speedup
$$S_P = \frac{\text{Time}(1\ \text{processor})}{\text{Time}(P\ \text{processors})}$$
• Efficiency
$$E_P = \frac{S_P}{P}$$
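As a small worked example of the two measures, the sketch below computes speedup and efficiency from wall-clock times. The function names are hypothetical, and the inputs are assumptions chosen for illustration: 20 hours on one processor, 2 hours in parallel, and P = 16.

```python
def speedup(t_one, t_p):
    """S_P = Time(1 processor) / Time(P processors)."""
    return t_one / t_p


def efficiency(t_one, t_p, p):
    """E_P = S_P / P."""
    return speedup(t_one, t_p) / p


# Assumed timings: 20 hours serially, 2 hours on an assumed 16 processors.
s = speedup(20.0, 2.0)
e = efficiency(20.0, 2.0, 16)
print(f"S_P = {s:.2f}, E_P = {e:.3f}")
```

Under these assumed numbers the speedup is 10 while the efficiency is well below 1, meaning a good part of the 16 processors' capacity is not turned into useful work.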
Parallel Performance (Parallel Sum)
Parallel efficiency of summing 10^7 doubles
Ideal Speedup
[Figure: speedup vs. number of processors, with linear, superlinear, and sublinear curves]
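The three regimes named on this slide can be told apart by comparing the measured speedup with the processor count P. The sketch below is one way to do that; the function name and the 5% tolerance used to call a result "linear" are assumptions made for illustration.

```python
def classify_speedup(t_one, t_p, p, rel_tol=0.05):
    """Label a measurement as linear, superlinear, or sublinear.

    t_one: time on 1 processor; t_p: time on p processors.
    rel_tol is an assumed tolerance around the ideal speedup S_P = P.
    """
    s = t_one / t_p
    if abs(s - p) <= rel_tol * p:
        return "linear"
    return "superlinear" if s > p else "sublinear"


# Hypothetical measurements on 4 processors.
for t_p in (25.0, 50.0, 20.0):
    print(t_p, classify_speedup(100.0, t_p, 4))
```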
Issue – Scalability
C vs. Python Parallel Performance
Performance Analysis of C and Python Parallel Implementations on a Multicore System Using Particle Simulation, 2024
Distributed Memory Systems
• Networked systems
• Distributed memory
• Local memory
• Remote memory
• Parallel file system
[Figure labels: Node, Code, Cluster]
Parallel Programming Models
Libraries: MPI, TBB, Pthreads, OpenMP, …
New languages: Haskell, X10, Chapel, …
Extensions: Coarray Fortran, UPC, Cilk, OpenCL, …
• Shared memory
– OpenMP, Pthreads, …
• Distributed memory
– MPI, UPC, …
• Hybrid
– MPI + OpenMP
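To make the distributed-memory style more concrete, here is a minimal sketch of explicit send and receive. It is not MPI. As an assumption made so the example runs with only the Python standard library, threads stand in for processes, and queues stand in for the network. Each worker touches only the data it receives as a message, which is the discipline message passing imposes. All names (`worker`, `message_passing_sum`) are hypothetical.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Receive one chunk, compute its partial sum of squares, send it back."""
    chunk = inbox.get()  # blocking receive
    outbox.put((rank, sum(x * x for x in chunk)))  # send result to the root


def message_passing_sum(values, p):
    """Root role: scatter chunks to p workers, then gather and add partial sums."""
    inboxes = [queue.Queue() for _ in range(p)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(p)
    ]
    for t in threads:
        t.start()
    # Scatter: each worker gets its own copy of a strided slice of the data.
    for rank in range(p):
        inboxes[rank].put(values[rank::p])
    # Gather: partial sums may arrive in any order.
    total = 0
    for _ in range(p):
        _rank, partial = results.get()
        total += partial
    for t in threads:
        t.join()
    return total


print(message_passing_sum(list(range(1, 11)), 3))
```

In an MPI program the same pattern would use separate processes with private memory; the scatter and gather steps here only mimic that structure.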
This course …
Large-scale Parallel Computing
• Message passing
• Parallel algorithms
• Designing parallel codes
• Performance analysis
Message Passing Paradigm
Profiling
Parallel I/O
[Figure: parallel I/O path; labels: not shared, shared, bridge nodes, IB network, 2 GB/s, 4 GB/s, 128:1]
Job Scheduling
[Figure: job scheduling; labels: nodes, users, jobs. Source: Wikipedia]
Parallel Deep Learning