UNIT - I
PARALLELISM FUNDAMENTALS AND ARCHITECTURE
Dr. Vishnu Kumar Kaliappan
Professor, Dept. CSE
[email protected],[email protected]
Reference: Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar, “Introduction to Parallel Computing”, 2nd ed., Pearson, 2012.
Course Objective
• To introduce the fundamentals of parallel and distributed computing architectures and paradigms.
• To develop and execute basic parallel and distributed applications using basic programming models and tools.
Course Outcome
• CO1: Design and implement distributed computing systems (Apply)
• CO5: Analyze the requirements for programming parallel systems and critically evaluate the strengths and weaknesses of parallel programming models (Analyze)
Parallel computing
• Parallel computing is a computing architecture in which multiple processors execute parts of an application or computation simultaneously.
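To make the idea concrete, here is a minimal sketch in C (assuming a compiler with OpenMP support, e.g. gcc -fopenmp; the array names are illustrative) in which the iterations of a loop are divided among the available processors:

#include <stdio.h>

#define N 1000000

int main(void) {
    static double a[N], b[N], c[N];

    /* Initialize the inputs serially. */
    for (int i = 0; i < N; i++) {
        a[i] = i;
        b[i] = 2.0 * i;
    }

    /* Each thread (typically one per processor core)
       computes a different chunk of the iterations. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    printf("c[N-1] = %f\n", c[N - 1]);
    return 0;
}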
Let’s see…
Assignment
• Google it…
• Find the top 10 supercomputers in the world and compare their performance (mention location, FLOPS, etc.).
• Write a one-pager and submit it in Google Classroom by tomorrow.
Motivation
• Development of parallel software has traditionally been thought of as time- and effort-intensive.
• This is largely attributed to the inherent complexity of specifying and coordinating concurrent tasks, and to a lack of portable algorithms, standardized environments, and software development toolkits.
• It can take two years to develop a parallel application, during which time the underlying hardware and/or software platform may have become obsolete.
• This is a result of the lack of implicit parallelism, as well as other bottlenecks such as the datapath and the memory.
Flynn’s Classification
• SISD: SISD is an abbreviation for Single Instruction and Single Data stream. It depicts the structure of a single computer, which includes a control unit, a memory unit, and a processing unit.
• Like classic von Neumann computers, most conventional computers utilize the SISD architecture.
(PE = Processing Element, CU = Control Unit, M = Memory)
➢ Examples: Minicomputers, workstations, and computers from previous generations.
Flynn’s Classification
• SIMD: SIMD is an abbreviation for Single Instruction and Multiple Data stream. It denotes an organization with a large number of processing units overseen by a common control unit: every unit executes the same instruction, each on its own data element.
• SIMD was created with array processors in mind. Under Flynn’s taxonomy, vector processors can also be included in this category, although there are SIMD architectures that are not vector processors.
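As an illustration, the loop below asks the compiler to apply one instruction to several data elements at a time using the CPU’s SIMD registers (a sketch, assuming a compiler that honors OpenMP’s simd directive, e.g. gcc -fopenmp):

#include <stdio.h>

#define N 1024

int main(void) {
    float a[N], b[N];

    for (int i = 0; i < N; i++) {
        a[i] = (float)i;
        b[i] = 1.0f;
    }

    /* Request vectorization: one SIMD instruction updates
       several adjacent elements of a[] per iteration. */
    #pragma omp simd
    for (int i = 0; i < N; i++)
        a[i] = a[i] * 2.0f + b[i];

    printf("a[0] = %f, a[N-1] = %f\n", a[0], a[N - 1]);
    return 0;
}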
Flynn’s Classification
• MISD: MISD is an abbreviation for Multiple Instruction and Single Data stream. Because no real system has been built using the MISD structure, it is primarily of theoretical importance.
• In MISD, multiple processing units work on a single data stream. Each processing unit works on the data in its own way, using its own instruction stream.
(M = Memory Module, P = Processing Unit, CU = Control Unit)
Flynn’s Classification
• MIMD: MIMD is an abbreviation for Multiple Instruction and Multiple Data stream. In this organization, all processors of a parallel computer may execute distinct instructions and act on different data at the same time.
• Each processor in MIMD has its own program, and each program generates its own instruction stream.
(PE = Processing Element, M = Memory Module, CU = Control Unit)
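A minimal sketch of MIMD-style execution using POSIX threads (assuming a POSIX system; compile with -pthread): two threads run different instruction streams on different data at the same time.

#include <stdio.h>
#include <pthread.h>

/* Thread 1: its own program, summing integers. */
static void *sum_task(void *arg) {
    long s = 0;
    for (int i = 1; i <= 100; i++) s += i;
    *(long *)arg = s;
    return NULL;
}

/* Thread 2: a different program, computing a factorial. */
static void *product_task(void *arg) {
    long p = 1;
    for (int i = 1; i <= 10; i++) p *= i;
    *(long *)arg = p;
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    long sum = 0, prod = 0;

    /* Two different instruction streams, two different data items. */
    pthread_create(&t1, NULL, sum_task, &sum);
    pthread_create(&t2, NULL, product_task, &prod);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    printf("sum = %ld, 10! = %ld\n", sum, prod);
    return 0;
}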
Multi-Core Processors
• A multicore processor is an integrated circuit with two or more processor cores attached, for enhanced performance and reduced power consumption.
• These processors also enable more efficient simultaneous processing of multiple tasks, such as with parallel processing and multithreading.
• A dual-core setup is similar to having two separate processors installed in a computer. However, because the two cores are packaged together and plug into the same socket, the connection between them is faster.
• The use of multicore processors or microprocessors is one approach to boosting processor performance without exceeding the practical limitations of semiconductor design and fabrication.
Multi-Core Processors
• How multicore processors work: the heart of every processor is an execution engine, also known as a core. The core is designed to process instructions and data according to the direction of the software programs in the computer’s memory.
• Clock speed. One approach was to make the processor’s clock faster. The clock is the “drumbeat” used to synchronize the processing of instructions and data through the processing engine.
• Hyper-threading. Another approach involved handling multiple instruction threads. Intel calls this hyper-threading. With hyper-threading, processor cores are designed to handle two separate instruction threads at the same time.
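One way to see hyper-threading from software is to ask the operating system how many logical processors are online; on a hyper-threaded CPU this is typically twice the number of physical cores. A small sketch (assuming a POSIX system, where sysconf is available):

#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* _SC_NPROCESSORS_ONLN counts logical processors
       (hardware threads), not physical cores. */
    long logical = sysconf(_SC_NPROCESSORS_ONLN);
    printf("Logical processors online: %ld\n", logical);
    return 0;
}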
Multi-Core Processors
• More chips. The next step was to add processor chips, or dies, to the processor package, which is the physical device that plugs into the motherboard. A dual-core processor includes two separate processor cores.
Multi-Core Processors
• Types of multicore processors: different multicore processors often have different numbers of cores. For example, a quad-core processor has four cores.
• Core types:
• Homogeneous (symmetric) cores. All of the cores in a homogeneous multicore processor are of the same type; typically, the core processing units are general-purpose central processing units that run a single multicore operating system.
• Heterogeneous (asymmetric) cores. Heterogeneous multicore processors have a mix of core types that often run different operating systems and include graphics processing units.
Multi-Core Processors
• Number and level of caches. Multicore processors vary in terms of their instruction and data caches, which are relatively small and fast pools of local memory.
• How cores are interconnected. Multicore processors also vary in terms of their bus architectures.
• Isolation. The amount, typically minimal, of on-chip support for the spatial and temporal isolation of cores:
• Physical isolation ensures that different cores cannot access the same physical hardware (e.g., memory locations such as caches and RAM).
• Temporal isolation ensures that the execution of software on one core does not impact the temporal behavior of software running on another core.
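A minimal OpenMP example of the kind introduced in this unit (a sketch, assuming gcc with -fopenmp; a typical “hello world”, not the original listing, of which only the closing lines survived):

#include <stdio.h>
#include <omp.h>

int main(void) {
    /* Every thread in the team executes this block. */
    #pragma omp parallel
    {
        printf("Hello from thread %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
    }
    return 0;
}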
Vector Processing
• A vector processor is a central processing unit that can execute a complete vector input with a single instruction. More specifically, it is a complete unit of hardware resources that executes a sequential set of similar data items in memory using a single instruction.
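As a sketch of the idea (assuming GCC or Clang, whose vector_size attribute maps such a type onto the CPU’s vector registers), the single addition below operates on four floats at once:

#include <stdio.h>

/* A 16-byte vector type: four packed floats. */
typedef float v4sf __attribute__((vector_size(16)));

int main(void) {
    v4sf a = {1.0f, 2.0f, 3.0f, 4.0f};
    v4sf b = {10.0f, 20.0f, 30.0f, 40.0f};

    /* One vector operation adds all four lanes. */
    v4sf c = a + b;

    for (int i = 0; i < 4; i++)
        printf("c[%d] = %f\n", i, c[i]);
    return 0;
}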
Vector Processor
• Advantages of a vector processor:
• A vector processor uses vector instructions, which improve the code density of the instructions.
• The sequential arrangement of data helps the hardware handle the data more efficiently.
• It offers a reduction in instruction bandwidth.
• So, from the above discussion, we can conclude that register-to-register architecture is better than memory-to-memory architecture because it offers a reduction in vector access time.
Recap
Motivation
Key Concepts and Challenges
Overview of Parallel computing
Flynn’s Taxonomy
Multi-core processors
Shared vs Distributed memory
Introduction to OpenMP programming
Instruction Level Support for Parallel Programming
SIMD
Vector Processing
GPUs.
End of Unit 1