1. Introduction

The document outlines the logistics and grading policy for the Parallel Computing course (CS 633) taught by Preeti Malakar at IIT Kanpur, including class hours, office hours, and communication protocols. It details the structure of assignments, attendance requirements, and the importance of academic integrity, particularly regarding plagiarism and the use of AI tools. The document also introduces key concepts in parallel computing, including multicore systems, performance measures, and programming models.


Parallel Computing (CS 633)

January 6, 2025

Preeti Malakar
[email protected]
Logistics
• Class hours: MW 3:30 – 5:00 PM (L16)
• Office hour: W 5:00 – 6:00 PM (KD 221)
• https://www.cse.iitk.ac.in/users/cs633/2024-25-2
– Lectures will be uploaded after every class
• Extra class/quiz/doubts: Saturday 11 AM – 12 PM
• Announcements/uploads on
– MooKIT
– Course email alias
• Email to the instructor should always be prefixed with [CS633] in the subject
Switch OFF All Devices

Grading Policy
• 75% attendance is compulsory for this course
• Participate actively in class
Lectures
• Lecture slides are pointers for the topic
– They won’t be as verbose as a book!
• In case you miss a class, please ensure you are
up to date with the lecture content
– Either ask your friend
– Or, ask the instructor (Saturday class)

Assignment

• One programming assignment in C


• In a group (group size = 4 or 5)
– Send group member information by Jan 14 via
Google forms (link will be shared on Jan 8)
– Clearly include names, roll numbers and IITK email IDs
• The mode of submission will be explained in due course
Assignment

• Timeline: Early February to Early March


• Credit for early submission
• Penalty for late submission
• Cannot be completed in a day!
• Discussion is NOT allowed outside your group
– You are responsible for keeping your code and report within your group only

Plagiarism

Plagiarism will NOT be tolerated


Use of AI tools is NOT allowed

Lecture 1

Introduction
Multicore Era

CPU evolution:
• Intel 4004 (1971): single core, single chip
• Cray X-MP (1982): single core, multiple chips
• Hydra (2000): multiple cores, single chip
• IBM POWER4 (2001): multiple cores, multiple chips
Moore’s Law (1965)
Number of transistors in a chip doubles every 18 months

[Source: Wikipedia]

“However, it must be programmed with a more complicated parallel programming model to obtain maximum performance.”
Trends

[Source: M. Frans Kaashoek, MIT]


top500.org (Nov’24)

~ $600 million
~ 7300 sq. ft.
~ 22 MW power
~ 23000 L water
Top #1 Supercomputer
https://www.top500.org/resources/top-systems/

green500.org (Nov’23)

Metric of interest: Performance per Watt


https://hpl-mxp.org/

Making of a Supercomputer

Source: energy.gov
Greenest Data Centre?

Source: MIT TR 06/19

“The 149,000 square foot facility built on a hillside overlooking the UC Berkeley campus and San Francisco Bay will house one of the most energy-efficient computing centers anywhere, tapping into the region’s mild climate to cool the supercomputers at the National Energy Research Scientific Computing Center (NERSC) and eliminating the need for mechanical cooling.”

https://www.science.org/content/article/climate-change-threatens-supercomputers
Top Supercomputers from India (Nov’23)

2024…

Supercomputing in India [topsc.cdacb.in, Jul’24]

Source: www.iitk.ac.in
Credit: Ashish Kuvelkar, CDAC
National Supercomputing Mission Sites

Big Compute

Massively Parallel Codes

Climate simulation of Earth [Credit: NASA]


Discretization

Gridded mesh for a global model [Credit: Tompkins, ICTP]

Numerical Weather Models

• Use numerical methods to solve equations that govern atmospheric processes (see the sketch below)
• Are based on fluid dynamics and depend on observations of meteorological variables
• Are used to obtain nowcast/forecast
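To make this concrete, here is a minimal sketch (illustrative only, not course code; the equation, grid size, and constants are assumptions) of solving one governing equation numerically: an explicit finite-difference update of the 1D heat equation on a discretized grid, the same pattern a weather model applies over its gridded mesh.

/* Illustrative sketch: explicit finite-difference update for the
 * 1D heat equation du/dt = alpha * d2u/dx2. Grid size, step counts,
 * and constants are made up for the example. */
#include <stdio.h>

#define N 100          /* number of grid points (illustrative) */
#define STEPS 1000     /* number of time steps (illustrative) */

int main(void) {
    double u[N], unew[N];
    const double alpha = 0.1, dx = 1.0 / N;
    const double dt = 0.4 * dx * dx / alpha;   /* satisfies stability limit */

    for (int i = 0; i < N; i++)                /* initial condition: a hot spot */
        u[i] = (i == N / 2) ? 1.0 : 0.0;

    for (int s = 0; s < STEPS; s++) {
        for (int i = 1; i < N - 1; i++)        /* stencil update on interior points */
            unew[i] = u[i] + alpha * dt / (dx * dx)
                             * (u[i-1] - 2.0 * u[i] + u[i+1]);
        unew[0] = unew[N-1] = 0.0;             /* fixed boundary values */
        for (int i = 0; i < N; i++)
            u[i] = unew[i];
    }
    printf("u[N/2] after %d steps: %f\n", STEPS, u[N/2]);
    return 0;
}

Each grid point only reads its neighbours, which is why such stencil updates parallelize well once the grid is partitioned across processes.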
Massively Parallel Simulations

Self-healing material simulation


[Nomura et al., “Nanocarbon synthesis by high-temperature oxidation of nanoparticles”, Scientific Reports, 2016]
Massively Parallel Analysis

[Nomura et al., “Nanocarbon synthesis by high-temperature oxidation of nanoparticles”, Scientific Reports, 2016]
Massively Parallel Codes

Cosmological simulation [Credit: ANL]


Massively Parallel Analysis
Virgo Consortium

Computational Science

[Source: Culler, Singh and Gupta]


Big Data

Output Data

• High-energy physics: 10 PB / year – Higgs boson simulation (Source: CERN)
• Cosmology: 2 PB / simulation, scaled to 786K cores on Mira – Q Continuum simulation (Source: Salman Habib et al.)
• Climate/weather: 240 TB / simulation – Hurricane simulation (Source: NASA)
Input Data

[Credit: World Meteorological Organization]


System Architecture Trends

[Credit: Pavan Balaji@ATPESC’17]


I/O trends

NERSC I/O trends [Credit: www.nersc.gov]


Compute vs. I/O trends
[Chart: I/O vs. FLOPS (Byte/FLOP) for the #1 supercomputer in the TOP500 list, 1997–2018, on a log scale from 1.00E-06 to 1.00E-03]
Why Parallel?

[Figure: a computation that takes 20 hours serially runs in 2 hours in parallel]
Parallelism
A parallel computer is a collection of processing
elements that communicate and cooperate to solve
large problems fast.

– Almasi and Gottlieb (1989)

Speedup
Example – Sum of squares of N numbers

Serial:
    for i = 1 to N
        sum += a[i] * a[i]
Cost: O(N)

Parallel (P processes):
    for i = 1 to N/P
        sum += a[i] * a[i]
    collate result
Cost: O(N/P) + communication time
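The parallel column above maps directly onto MPI. Below is a minimal C sketch (illustrative; N, the data values, and the uniform chunking are assumptions): each of the P processes sums the squares of its N/P elements, and MPI_Reduce collates the partial sums (the communication cost in the slide's model).

/* Illustrative MPI sketch of the parallel sum of squares. */
#include <mpi.h>
#include <stdio.h>

#define N 1000000  /* total number of elements (illustrative) */

int main(int argc, char **argv) {
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Each process handles a contiguous chunk of roughly N/P elements;
       the last rank picks up the remainder. a[i] = 1.0 stands in for data. */
    long chunk = N / nprocs;
    long start = rank * chunk;
    long end   = (rank == nprocs - 1) ? N : start + chunk;

    double local_sum = 0.0;
    for (long i = start; i < end; i++) {
        double a_i = 1.0;               /* placeholder for a[i] */
        local_sum += a_i * a_i;
    }

    /* Collate the partial sums on rank 0. */
    double global_sum = 0.0;
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0,
               MPI_COMM_WORLD);

    if (rank == 0)
        printf("Sum of squares = %f\n", global_sum);
    MPI_Finalize();
    return 0;
}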
Performance Measure
• Speedup: S_P = Time(1 processor) / Time(P processors)

• Efficiency: E_P = S_P / P
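As a quick sanity check of these definitions, the snippet below computes S_P and E_P for P = 2 using times from the table on the next slide (the slide rounds to 1.9 and 0.95):

/* Tiny illustrative helper: speedup and efficiency from measured times. */
#include <stdio.h>

int main(void) {
    double t1 = 0.025;   /* time on 1 processor (sec), from the next slide */
    double tp = 0.013;   /* time on P processors (sec) */
    int p = 2;

    double speedup = t1 / tp;          /* S_P = T_1 / T_P */
    double efficiency = speedup / p;   /* E_P = S_P / P   */
    printf("S_%d = %.2f, E_%d = %.2f\n", p, speedup, p, efficiency);
    return 0;
}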
Parallel Performance (Parallel Sum)
Parallel efficiency of summing 10^7 doubles

#Processes   Time (sec)   Speedup   Efficiency
     1         0.025        1.0        1.00
     2         0.013        1.9        0.95
     4         0.010        2.5        0.63
     8         0.009        2.8        0.35
    12         0.007        3.6        0.30
Ideal Speedup
[Plot: speedup vs. number of processors, with linear, superlinear, and sublinear speedup curves]
Issue – Scalability

[Source: M. Frans Kaashoek, MIT]


Scalability Bottleneck

Performance of a weather simulation application


Scalability and Performance

C vs. Python Parallel Performance

Performance Analysis of C and Python Parallel Implementations on a Multicore System Using Particle Simulation, 2024
Parallelism
A parallel computer is a collection of processing
elements that communicate and cooperate to solve
large problems fast.

– Almasi and Gottlieb (1989)

Distributed Memory Systems

• Networked systems
• Distributed memory
  – Local memory
  – Remote memory
• Parallel file system

[Diagram: a node and a cluster]
Parallel Programming Models
Libraries       MPI, TBB, Pthread, OpenMP, …
New languages   Haskell, X10, Chapel, …
Extensions      Coarray Fortran, UPC, Cilk, OpenCL, …

• Shared memory
– OpenMP, Pthreads, …
• Distributed memory
– MPI, UPC, …
• Hybrid
– MPI + OpenMP (see the sketch below)
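As a sketch of the hybrid model (illustrative, not course-provided code), the following combines MPI across processes with OpenMP threads inside each process:

/* Illustrative hybrid MPI + OpenMP hello. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int provided, rank;
    /* Request thread support suitable for OpenMP regions. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    #pragma omp parallel
    {
        printf("MPI rank %d, OpenMP thread %d of %d\n",
               rank, omp_get_thread_num(), omp_get_num_threads());
    }

    MPI_Finalize();
    return 0;
}

Typically built with an MPI compiler wrapper plus the OpenMP flag (e.g. mpicc -fopenmp) and launched with an MPI launcher, with OMP_NUM_THREADS controlling the threads per rank.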
This course …

Large-scale Parallel Computing

• Message passing
• Parallel algorithms
• Designing parallel codes
• Performance analysis
Message Passing Paradigm

• Point-to-point (P2P) communications (see the sketch below)
• Collective communications
• Algorithms
• Performance
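A minimal P2P sketch (illustrative; the payload and tag are arbitrary): rank 0 sends an integer to rank 1 through a matched MPI_Send/MPI_Recv pair. Run with at least two processes.

/* Illustrative MPI point-to-point example. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        int value = 42;                        /* arbitrary payload */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int value;
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("Rank 1 received %d from rank 0\n", value);
    }

    MPI_Finalize();
    return 0;
}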
Profiling

Parallel I/O
[Diagram: compute node racks connect via bridge nodes to an IB network, then to I/O nodes and a GPFS filesystem; 2 GB/s links (not shared), 4 GB/s links (shared), 128:1 compute-to-I/O node ratio]
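A minimal MPI-IO sketch of parallel I/O to a shared file (illustrative; the filename, block size, and data are assumptions): each rank writes its own disjoint block at a rank-dependent offset, so writes never overlap.

/* Illustrative MPI-IO example: each rank writes one block of a shared file. */
#include <mpi.h>

#define COUNT 4  /* integers written per rank (illustrative) */

int main(int argc, char **argv) {
    int rank;
    MPI_File fh;
    int buf[COUNT];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (int i = 0; i < COUNT; i++)
        buf[i] = rank * COUNT + i;             /* made-up data */

    /* All ranks open the same file; rank-dependent offsets keep writes disjoint. */
    MPI_File_open(MPI_COMM_WORLD, "out.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY,
                  MPI_INFO_NULL, &fh);
    MPI_Offset offset = (MPI_Offset)rank * COUNT * sizeof(int);
    MPI_File_write_at(fh, offset, buf, COUNT, MPI_INT, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    MPI_Finalize();
    return 0;
}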
Job Scheduling

[Diagram: users submit jobs, which the scheduler maps onto nodes (Wikipedia)]

Example of real supercomputer activity: Argonne National Laboratory Theta jobs
Supercomputer Activity

A graphical representation of all jobs running on the supercomputer

Parallel Deep Learning

Demystifying Parallel and Distributed Deep Learning: An In-depth Concurrency Analysis


Reference Material

• DE Culler, JP Singh and A Gupta, Parallel Computer Architecture: A Hardware/Software Approach, Morgan Kaufmann, 1998.
• A Grama, A Gupta, G Karypis and V Kumar, Introduction to Parallel Computing, 2nd Ed., Addison-Wesley, 2003.
• Marc Snir, Steve W. Otto, Steven Huss-Lederman, David W. Walker and Jack Dongarra, MPI – The Complete Reference, Second Edition, Volume 1: The MPI Core.
• Bill Gropp, Using MPI, Third Edition, The MIT Press, 2014.
• Research papers
