01 Introduction
Introduction
Vishwesh Jatala
Assistant Professor
Department of CSE
Indian Institute of Technology Bhilai
2023-24 W
1
What is the output?
2
Outline of Today’s Lecture
■ Why?
■ What?
■ How?
3
Moore’s Law
4
Moore’s Law Effect
5
Moore’s Law Effect
6
Parallel Architectures are Everywhere!
7
Parallel Hardware
Distributed CPUs
Multicores
GPUs
8
Hardware and Software
[Figure: Single-core CPU with Core, L1 Cache, L2 Cache, and DRAM; software: Sequential; Output]
9
Hardware and Software
[Figure: Multi-core CPU with Core1 to Core4, one L1 Cache per core, L2 Cache, and DRAM; software: same sequential program?]
10
Professor P
15 questions
300 Answer sheets
11
Professor P’s Teaching Assistants
TA#1
TA#2 TA#3
12
Benefits of Parallel Programming
13
Parallel Programming Applications
Climate Modeling
CFD
14
Parallel Programming Applications
15
Challenges!
16
Example-1
Sequential Execution
Core-0
A[1]=1
A[2]=2
A[3]=3
A[4]=4
17
Example-1
Parallel Execution
18
Example-2
Sequential Execution
int count = 0;
for (int i = 0; i < 5; i++)
    A[i] = count++;
Core-0
A[0]=0
A[1]=1
A[2]=2
A[3]=3
A[4]=4
19
Example-2
int count = 0;
for (int i=0; i<5; i++)
A[i]= count++;
Parallel Execution
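A minimal sketch of why this loop resists naive parallel execution: every iteration reads and updates the shared `count`, so iteration i cannot start before iteration i-1 finishes. In this particular loop `count` equals `i` whenever `A[i]` is written, so the dependence can be removed by computing each element from its own index. The helper names `fill_sequential` and `fill_chunk` are hypothetical and not from the slides.

```c
#include <stddef.h>

/* Original loop: every iteration reads and updates the shared counter,
   so iteration i cannot start before iteration i-1 has finished. */
void fill_sequential(int *A, size_t n) {
    int count = 0;
    for (size_t i = 0; i < n; i++)
        A[i] = count++;
}

/* Dependence-free rewrite (hypothetical helper): in this loop count == i
   whenever A[i] is written, so each element depends only on its own index.
   A core can fill any range [first, last) without waiting for other cores. */
void fill_chunk(int *A, size_t first, size_t last) {
    for (size_t i = first; i < last; i++)
        A[i] = (int)i;
}
```

For instance, one core could call `fill_chunk(A, 0, 3)` while another calls `fill_chunk(A, 3, 5)`; in either order the array ends up the same as after `fill_sequential(A, 5)`.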
20
Challenges:
21
Example-3: Sequential Version
int sum = 0;
for (i = 0; i < n; i++) {
x = f(i);
sum = sum+x;
}
Core-0
1 4 3 9 2 8 5 1 1 6 2 7 2 5 0 4 1 8 6 5 1 2 3 9
sum = 95
22
Example-3: Parallel Version-1
my_sum = 0;
my_first_i = . . . ;
my_last_i = . . . ;
for (my_i = my_first_i; my_i < my_last_i; my_i++) {
    my_x = f(my_i);
    my_sum += my_x;
}
1 4 3 9 2 8 5 1 1 6 2 7 2 5 0 4 1 8 6 5 1 2 3 9
my_sum 8 19 7 15 7 13 12 14
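The partial sums above are consistent with giving each of 8 cores 3 consecutive values out of the 24. A possible sketch of that split follows; the block partition and the helper names `block_range` and `partial_sum` are assumptions for illustration, and the values are read from an array rather than produced by calling f(i).

```c
#include <stddef.h>

/* Block partition (assumed): core `rank` out of `p` cores gets n/p
   consecutive indices. Assumes p divides n, as with n = 24 and p = 8. */
void block_range(size_t n, size_t p, size_t rank,
                 size_t *my_first_i, size_t *my_last_i) {
    size_t chunk = n / p;
    *my_first_i = rank * chunk;
    *my_last_i = *my_first_i + chunk;
}

/* Work of one core: add up its own values, touching no shared variable. */
int partial_sum(const int *values, size_t my_first_i, size_t my_last_i) {
    int my_sum = 0;
    for (size_t my_i = my_first_i; my_i < my_last_i; my_i++)
        my_sum += values[my_i];
    return my_sum;
}
```

With the 24 values listed above, core 0 gets indices 0 to 2 (1 + 4 + 3 = 8) and core 1 gets indices 3 to 5 (9 + 2 + 8 = 19), matching the first two `my_sum` entries.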
23
Example-3: Parallel Version-1
1 4 3 9 2 8 5 1 1 6 2 7 2 5 0 4 1 8 6 5 1 2 3 9
my_sum 8 19 7 15 7 13 12 14
sum = 8+19+7+15+7+13+12+14 = 95
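One way the global sum could be formed is for a single designated core to collect every `my_sum` and add them in turn. The sketch below assumes that scheme; the name `master_global_sum` is hypothetical, and the array read stands in for whatever communication actually delivers each partial sum.

```c
/* Global sum computed by one designated core (assumed to be core 0).
   It adds the partial sums one at a time, so it performs p - 1 additions
   while the other cores have nothing left to do. */
int master_global_sum(const int *my_sums, int p) {
    int sum = my_sums[0];              /* the master's own partial sum */
    for (int core = 1; core < p; core++)
        sum += my_sums[core];          /* stands in for "receive from core" */
    return sum;
}
```

Applied to the eight partial sums above, this returns 95, the same value as the sequential version.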
24
Example-3: Parallel Version-2
25
Parallel Version-1 or Version-2?
26
Challenges:
Communication
Synchronization
27
Example-4:
int sum = 0;
for (i = 0; i < n; i++) {
x = f(i);
sum = sum+x;
}
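A possible reading of this example, given that load balancing is the next challenge listed, is that f(i) does not take the same time for every i. The sketch below assumes a made-up cost model in which f(i) takes i + 1 time units and compares how busy the most loaded core is under two ways of assigning iterations. The cost model and the function names are hypothetical, not from the slides.

```c
#include <stddef.h>

/* Hypothetical cost model: evaluating f(i) takes i + 1 time units. */
static long cost(size_t i) { return (long)i + 1; }

/* Load of the busiest core when each core gets n/p consecutive iterations.
   Assumes p divides n. */
long max_load_block(size_t n, size_t p) {
    long worst = 0;
    size_t chunk = n / p;
    for (size_t rank = 0; rank < p; rank++) {
        long load = 0;
        for (size_t i = rank * chunk; i < (rank + 1) * chunk; i++)
            load += cost(i);
        if (load > worst) worst = load;
    }
    return worst;
}

/* Load of the busiest core when iteration i goes to core i % p. */
long max_load_cyclic(size_t n, size_t p) {
    long worst = 0;
    for (size_t rank = 0; rank < p; rank++) {
        long load = 0;
        for (size_t i = rank; i < n; i += p)
            load += cost(i);
        if (load > worst) worst = load;
    }
    return worst;
}
```

Under this assumed cost model with n = 24 and p = 8, the busiest core carries 69 units with the block split and 48 with the cyclic split, against an ideal of 300 / 8 = 37.5. The running time is set by the busiest core, so how iterations are distributed matters as much as how many each core gets.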
28
Challenges:
Communication
Synchronization
Load Balancing
29
What did we learn so far?
■ Sequential programs are inefficient
■ Parallel hardware is inevitable
■ Parallelization is promising
■ Challenges!
30
We will discuss the details starting next class!
31
Course Outline (Part-1)
❑ Optimizations
■ Case studies
■ Extracting Parallelism from Sequential Programs Automatically
32
Course Logistics
■ Lecture Hours:
❑ Mon, Tue, Thu: 10:30 am - 11:25 am
33
Course Logistics: Evaluation Scheme
■ Attendance
❑ 0% - 50%: 0 marks
❑ >50%: Marks out of 10, awarded in proportion to the sessions attended beyond the 50% mark.
❑ Example:
■ Total sessions: 16 (50% = 8 sessions)
■ #sessions attended = 7 (<50%), marks = 0
■ #sessions attended = 10 (62.5%), marks = (10 - 8) * 10 / 8 = 2.5
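The arithmetic in the example can be written as a small function. This is a sketch that infers a linear rule from the single worked example (2*10/8 = 2.5); the function name `attendance_marks` is hypothetical, and the actual policy may round or differ in edge cases.

```c
/* Attendance marks out of 10, inferred from the worked example:
   0 marks up to and including 50% attendance, then linear in the number
   of sessions attended beyond the 50% mark. */
double attendance_marks(int attended, int total) {
    double threshold = total / 2.0;    /* sessions needed to reach 50% */
    if (attended <= threshold)
        return 0.0;
    return (attended - threshold) * 10.0 / (total - threshold);
}
```

With 16 sessions, `attendance_marks(7, 16)` gives 0 and `attendance_marks(10, 16)` gives 2.5, matching the example; full attendance gives 10.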
34
Course Logistics: References
35
Course Logistics: Tools
■ Platforms:
❑ Prefer Google Colab for GPUs and CUDA Programming.
❑ A demo session
36
Course Logistics
■ Evaluation Policy:
❑ Acknowledge all the sources
❑ Do not cheat
37
Outcome of the Course?
38
Course Logistics
Office: B-010
39
References
■ https://www.cse.iitd.ac.in/~soham/COL380/page.html
■ https://s3.wp.wsu.edu/uploads/sites/1122/2017/05/6-9-2017-slides-vFinal.pptx
■ Miscellaneous resources on the internet
40
Thank you!
41