PARALLEL PROCESSING CONCEPTS
Prof. Shashikant V. Athawale
Assistant Professor | Computer Engineering
Department | AISSMS College of Engineering,
Kennedy Road, Pune, MH, India - 411001
Contents
➢ Introduction to Parallel Computing
➢ Motivating Parallelism
➢ Scope of Parallel Computing
➢ Parallel Programming Platforms
➢ Implicit Parallelism
➢ Trends in Microprocessor Architectures
➢ Limitations of Memory System Performance
➢ Dichotomy of Parallel Computing Platforms
➢ Physical Organization of Parallel Platforms
➢ Communication Costs in Parallel Machines
➢ Scalable design principles
➢ Architectures: N-wide superscalar architectures
➢ Multi-core architectures
Introduction to Parallel Computing
A parallel computer is a “Collection of processing
elements that communicate and co-operate to solve large
problems fast”.
Processing multiple tasks simultaneously on
multiple processors is called parallel processing.
What is Parallel Computing?
Traditionally, software has been written for serial computation,
to be run on a single computer having a single Central Processing Unit (CPU).
In the simplest sense, parallel computing is the simultaneous use of
multiple compute resources to solve a computational problem.
Serial vs. Parallel Computing
Figure: Serial vs. parallel computing. Diagram labels: Fetch/Store, Compute, Fetch/Store, Compute, Communicate ("cooperative game").
Motivating Parallelism
The role of parallelism in accelerating computing
speeds has been recognized for several decades.
Its role in providing multiplicity of datapaths and
increased access to storage elements has been
significant in commercial applications.
The scalable performance and lower cost of parallel
platforms are reflected in the wide variety of applications.
Developing parallel hardware and software has traditionally
been time and effort intensive.
If one views this in the context of rapidly improving
uniprocessor speeds, one is tempted to question the need for
parallel computing.
However, uniprocessor architectures may not be able to sustain
this rate of improvement. This is the result of a number of
fundamental physical and computational limitations.
The emergence of standardized parallel programming
environments, libraries, and hardware has significantly
reduced time to (parallel) solution.
In short
1. Overcome limits to serial computing
2. Limits to increasing transistor density
3. Limits to data transmission speed
4. Faster turn-around time
5. Solve larger problems
Scope of Parallel Computing
➢ Parallel computing has a great impact on a wide range of
applications.
➢ Commercial
➢ Scientific
➢ Turnaround time should be minimal
➢ High performance
➢ Resource management
➢ Load balancing
➢ Dynamic library
➢ Minimum network congestion and latency
Applications
➢ Commercial computing
- Weather forecasting
- Remote sensors, image processing
- Process optimization, operations research
➢ Scientific and engineering applications
- Computational chemistry
- Molecular modelling
- Structural mechanics
➢ Business applications
- E-Governance
- Medical imaging
➢ Internet applications
- Internet servers
- Digital libraries
Parallel Programming Platforms
➢ The main objective is to provide sufficient
detail for programmers to write
efficient code on a variety of platforms.
➢ A further objective is to quantify the performance
of various parallel algorithms.
Implicit Parallelism
A programming language is said to be
implicitly parallel if its compiler or interpreter
can recognize opportunities for
parallelization and implement them without
being told to do so.
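As a small-scale illustration of the same idea, the sketch below shows a C loop written so that an optimizing compiler is free to parallelize it (here, vectorize it) without being told to. C is not one of the implicitly parallel languages, and whether a given compiler actually vectorizes this loop depends on the compiler and its optimization settings; the function name `saxpy` and its parameters are illustrative assumptions, not part of the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Scale-and-add over two arrays: y[i] = alpha * x[i] + y[i].
 * The `restrict` qualifiers promise the compiler that x and y do not
 * overlap, so every iteration is independent of the others and the
 * compiler may choose to execute several iterations at once. */
void saxpy(size_t n, float alpha, const float *restrict x, float *restrict y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++)
        y[i] = alpha * x[i] + y[i];
}
```

The source code contains no parallel construct at all; any parallel execution is the compiler's decision, which is the defining property of implicit parallelism.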
Implicitly parallel programming languages
➢ Microsoft Axum
➢ MATLAB's M-code
➢ ZPL
➢ Laboratory Virtual Instrument Engineering Workbench (LabVIEW)
➢ NESL
➢ SISAL
➢ High-Performance Fortran (HPF)
Dichotomy of Parallel Computing Platforms
➢ We first explore a dichotomy based on the logical and
physical organization of parallel platforms.
➢ The logical organization refers to a programmer's
view of the platform, while the physical organization
refers to the actual hardware organization of the
platform.
➢ The two critical components of parallel computing
from a programmer's perspective are ways of
expressing parallel tasks and mechanisms for
specifying interaction between these tasks.
➢ The former is sometimes also referred to as the
control structure and the latter as the communication
model.
Control Structure of Parallel Platforms
Parallel tasks can be specified at various levels of granularity.
At one extreme, each program in a set of programs can be viewed
as one parallel task. At the other extreme, individual instructions
within a program can be viewed as parallel tasks. Between these
extremes lies a range of models for specifying the control structure
of programs and the corresponding architectural support for them.
Parallelism from single instruction on multiple processors
Consider the following code segment that adds two vectors:
    for (i = 0; i < 1000; i++)
        c[i] = a[i] + b[i];
In this example, the iterations of the loop are independent
of each other; i.e., c[0] = a[0] + b[0]; c[1] = a[1] + b[1]; etc., can all be
executed independently of each other. Consequently, if there is a mechanism for executing the same
instruction, in this case add, on all the processors with appropriate data, we
could execute this loop much faster.
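One way to picture "the same instruction on all the processors with appropriate data" is to split the iterations into one block per processing element. The sketch below is a minimal, sequential simulation of that split; the names `add_block` and `vector_add_blocks` and the block-partitioning scheme are assumptions made for illustration, and the blocks run one after another here rather than on real parallel hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Work done by one (hypothetical) processing element: the same add
 * instruction applied to its own block of the data. */
static void add_block(const double *a, const double *b, double *c,
                      size_t begin, size_t end)
{
    for (size_t i = begin; i < end; i++)
        c[i] = a[i] + b[i];
}

/* Split n independent iterations into p contiguous blocks.
 * This loop visits the blocks in turn; because no block reads another
 * block's output, a parallel machine could run all of them at once. */
void vector_add_blocks(const double *a, const double *b, double *c,
                       size_t n, size_t p)
{
    assert(p > 0);
    for (size_t pe = 0; pe < p; pe++) {
        size_t begin = pe * n / p;        /* first index owned by pe */
        size_t end = (pe + 1) * n / p;    /* one past the last index */
        add_block(a, b, c, begin, end);
    }
}
```

With n = 1000 and p = 4, each processing element would own 250 consecutive elements of a, b, and c.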
Figure: A typical SIMD architecture (a) and a typical MIMD architecture (b).
Executing a conditional statement on an SIMD computer
with four processors: (a) the conditional statement; (b) the
execution of the statement in two steps
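The two-step execution can be simulated in plain C. The slide does not reproduce the conditional statement itself, so the sketch below assumes the statement `if (B == 0) C = A; else C = A / B;`; the function name `simd_conditional` and the activity mask are likewise illustrative, not taken from the slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_PE 4  /* four processing elements, as in the figure caption */

/* Simulates how an SIMD machine could run the assumed statement
 *     if (B == 0) C = A; else C = A / B;
 * Every processing element receives the same instruction; an activity
 * mask decides which elements actually execute it. */
void simd_conditional(const int A[NUM_PE], const int B[NUM_PE], int C[NUM_PE])
{
    bool mask[NUM_PE];
    assert(A != NULL && B != NULL && C != NULL);

    /* All elements evaluate the condition on their own data. */
    for (int pe = 0; pe < NUM_PE; pe++)
        mask[pe] = (B[pe] == 0);

    /* Step 1: elements with B == 0 execute "C = A"; the rest idle. */
    for (int pe = 0; pe < NUM_PE; pe++)
        if (mask[pe])
            C[pe] = A[pe];

    /* Step 2: the remaining elements execute "C = A / B";
     * the elements from step 1 idle. */
    for (int pe = 0; pe < NUM_PE; pe++)
        if (!mask[pe])
            C[pe] = A[pe] / B[pe];
}
```

In each step only part of the machine does useful work, which is why conditional code tends to reduce utilization on SIMD hardware.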
Communication Model of Parallel Platforms
Shared-Address-Space Platforms
Typical shared-address-space architectures: (a) uniform-memory-access
shared-address-space computer; (b) uniform-memory-access
shared-address-space computer with caches and memories; (c) non-uniform-memory-access
shared-address-space computer with local memory only.
Message-Passing Platforms
The logical machine view of a message-passing platform
consists of p processing nodes.
Instances of such a view come naturally from clustered workstations
and non-shared-address-space multicomputers.
On such platforms, interactions between processes running
on different nodes must be accomplished using messages,
hence the name message passing.
This exchange of messages is used to transfer data, work,
and to synchronize actions among the processes.
In its most general form, message-passing paradigms
support execution of a different program on each of the p
nodes.
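A minimal sketch of the idea, assuming a POSIX system: two processes with separate address spaces stand in for two nodes, and pipes stand in for the interconnect. Real clusters would normally use a message-passing library such as MPI; the function name `remote_sum` and the choice of pipes are assumptions made only to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sends n integers to a second process as messages, lets that process do
 * the work (summing them), and receives the result as a reply message.
 * Returns 0 on success and -1 on failure. */
int remote_sum(const int *data, size_t n, long *result)
{
    int to_child[2], to_parent[2];
    assert(data != NULL && result != NULL);

    if (pipe(to_child) == -1)
        return -1;
    if (pipe(to_parent) == -1) {
        close(to_child[0]); close(to_child[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(to_child[0]); close(to_child[1]);
        close(to_parent[0]); close(to_parent[1]);
        return -1;
    }

    if (pid == 0) {                       /* worker "node" */
        long sum = 0;
        int value;
        close(to_child[1]);
        close(to_parent[0]);
        /* read() blocks until a message arrives, so receiving a message
         * also synchronizes the worker with the sender. */
        while (read(to_child[0], &value, sizeof value) == (ssize_t)sizeof value)
            sum += value;
        if (write(to_parent[1], &sum, sizeof sum) != (ssize_t)sizeof sum)
            _exit(1);
        _exit(0);
    }

    /* sender "node" */
    int ok = 1;
    close(to_child[0]);
    close(to_parent[1]);
    for (size_t i = 0; i < n; i++)        /* transfer the data */
        if (write(to_child[1], &data[i], sizeof data[i]) != (ssize_t)sizeof data[i])
            ok = 0;
    close(to_child[1]);                   /* no more messages */
    if (read(to_parent[0], result, sizeof *result) != (ssize_t)sizeof *result)
        ok = 0;                           /* reply message with the result */
    close(to_parent[0]);
    waitpid(pid, NULL, 0);
    return ok ? 0 : -1;
}
```

The sender and the worker share no variables after the fork; everything the worker knows about the data arrives in messages, and the blocking receive is the only synchronization.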
Physical Organization of Parallel Platforms
Architecture of an Ideal Parallel Computer
Exclusive-read, exclusive-write (EREW) PRAM. In this class,
access to a memory location is exclusive. No concurrent read or
write operations are allowed.
Concurrent-read, exclusive-write (CREW) PRAM. In this class,
multiple read accesses to a memory location are allowed. However,
multiple write accesses to a memory location are serialized.
Exclusive-read, concurrent-write (ERCW) PRAM. Multiple write
accesses are allowed to a memory location, but multiple read
accesses are serialized.
Concurrent-read, concurrent-write (CRCW) PRAM. This class
allows multiple read and write accesses to a common memory
location. This is the most powerful PRAM model.
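The four classes differ only in whether several processors may read, or write, the same location in one step. The sketch below classifies a single step accordingly; it assumes reads and writes happen in separate phases of the step, and the names `pram_model`, `has_duplicate`, and `weakest_model` are hypothetical helpers introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { EREW, CREW, ERCW, CRCW } pram_model;

/* True if any address appears more than once among the n entries.
 * A negative entry means "this processor makes no access". */
static bool has_duplicate(const int *addr, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (addr[i] >= 0 && addr[i] == addr[j])
                return true;
    return false;
}

/* Returns the most restrictive PRAM class that permits one step in which
 * processor i reads address reads[i] and writes address writes[i]. */
pram_model weakest_model(const int *reads, const int *writes, size_t p)
{
    assert(reads != NULL && writes != NULL);
    bool concurrent_read = has_duplicate(reads, p);
    bool concurrent_write = has_duplicate(writes, p);

    if (concurrent_read && concurrent_write)
        return CRCW;
    if (concurrent_write)
        return ERCW;
    if (concurrent_read)
        return CREW;
    return EREW;
}
```

For example, a step in which two processors read address 0 while all writes go to distinct addresses needs at least a CREW PRAM.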
Interconnection Networks for Parallel Computers
▹ Interconnection networks can be classified
as static or dynamic. Static networks consist of point-to-point
communication links among processing nodes
and are also referred to as direct networks. Dynamic networks
are built using switches and communication links and are also
referred to as indirect networks.
Figure: Classification of interconnection networks: (a) a static network; and (b) a dynamic network.
Network Topology
Linear Arrays
Linear arrays: (a) with no wraparound links; (b) with
wraparound link.
Two and three dimensional meshes: (a) 2-D mesh with no
wraparound; (b) 2-D mesh with wraparound link (2-D
torus); and (c) a 3-D mesh with no wraparound.
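The effect of the wraparound links can be seen by computing each node's neighbors. This is a minimal sketch under assumed conventions that are not in the slides: nodes are numbered from 0, mesh nodes use row-major ids (row * cols + col), and the function names are hypothetical.

```c
#include <assert.h>

/* Right-hand neighbor of node i in a linear array of n nodes.
 * Without wraparound the last node has no right neighbor (-1);
 * with the wraparound link the array becomes a ring. */
int linear_right(int i, int n, int wraparound)
{
    assert(n > 0 && i >= 0 && i < n);
    if (i + 1 < n)
        return i + 1;
    return wraparound ? 0 : -1;
}

/* Neighbors of node (row, col) in a rows x cols 2-D mesh with
 * wraparound (2-D torus), written to out[] as row-major node ids
 * in the order north, south, west, east. */
void torus_neighbors(int row, int col, int rows, int cols, int out[4])
{
    assert(rows > 0 && cols > 0);
    assert(row >= 0 && row < rows && col >= 0 && col < cols);
    out[0] = ((row + rows - 1) % rows) * cols + col;  /* north */
    out[1] = ((row + 1) % rows) * cols + col;         /* south */
    out[2] = row * cols + (col + cols - 1) % cols;    /* west  */
    out[3] = row * cols + (col + 1) % cols;           /* east  */
}
```

In the torus every node has exactly four neighbors, whereas in a mesh without wraparound the corner and edge nodes have fewer.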
Construction of hypercubes from hypercubes of lower
dimension.
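The recursive construction has a compact form if nodes carry binary labels, which is the usual convention but is assumed here rather than stated on the slide. In this sketch the function names are hypothetical: two nodes are neighbors exactly when their labels differ in one bit, and joining two d-dimensional hypercubes links each node to the node whose label differs only in the new bit d.

```c
#include <assert.h>

/* Fills out[0..d-1] with the neighbors of `node` in a d-dimensional
 * hypercube of 2^d nodes. Flipping bit k of the label crosses
 * dimension k. */
void hypercube_neighbors(unsigned node, unsigned d, unsigned out[])
{
    assert(d < 32 && node < (1u << d));
    for (unsigned k = 0; k < d; k++)
        out[k] = node ^ (1u << k);
}

/* A (d+1)-dimensional hypercube is two d-dimensional hypercubes with
 * corresponding nodes linked. With binary labels, the partner of `node`
 * in the other copy differs only in bit d. */
unsigned partner_in_other_copy(unsigned node, unsigned d)
{
    assert(d < 31 && node < (1u << (d + 1)));
    return node ^ (1u << d);
}
```

Each node of a d-dimensional hypercube therefore has d neighbors, one per dimension.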
Tree-Based Networks
Complete binary tree networks: (a) a static tree network;
and (b) a dynamic tree network.
Scalable Design principles
❖ Avoid the single point of failure.
❖ Scale horizontally, not vertically.
❖ Push work as far away from the core as possible.
❖ API first.
❖ Cache everything, always.
❖ Provide data that is only as fresh as needed.
❖ Design for maintenance and automation.
❖ Asynchronous rather than synchronous.
❖ Strive for statelessness.
N-wide superscalar architecture:
❖ A superscalar architecture is called N-wide
if it can fetch and dispatch N instructions in
every cycle.
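The benefit of a wider issue width depends on how independent the instructions are. The sketch below counts issue cycles under a deliberately simplified model that is an assumption, not something stated in the slides: in-order issue, unit latency, at most one data dependence per instruction, and no resource conflicts. The name `issue_cycles` and the `dep[]` encoding are hypothetical.

```c
#include <assert.h>

#define MAX_INSTR 64

/* Counts the cycles needed to issue `count` instructions on a machine
 * that issues up to `width` instructions per cycle, in program order.
 * dep[i] is the index of the earlier instruction whose result
 * instruction i needs, or -1 if it needs none. An instruction cannot
 * issue in the same cycle as the instruction it depends on. */
int issue_cycles(const int dep[], int count, int width)
{
    int issued_at[MAX_INSTR];
    int cycle = 1;   /* cycle currently being filled */
    int used = 0;    /* issue slots already used in that cycle */

    assert(width > 0 && count >= 0 && count <= MAX_INSTR);
    for (int i = 0; i < count; i++) {
        assert(dep[i] < i);
        /* Earliest cycle allowed by the data dependence. */
        int ready = (dep[i] >= 0) ? issued_at[dep[i]] + 1 : 1;
        /* Start a new cycle when this one is full or the operand
         * is not available yet. */
        if (used == width || ready > cycle) {
            cycle++;
            used = 0;
        }
        issued_at[i] = cycle;
        used++;
    }
    return count == 0 ? 0 : cycle;
}
```

Under this model, four independent instructions take two cycles on a 2-wide machine, while a chain of four dependent instructions still takes four cycles regardless of the width.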
Multi-core architectures:
❖ Many cores fit on a single processor socket.
❖ Also called a chip multiprocessor (CMP).
❖ These cores run in parallel.
❖ The architecture of a multi-core processor enables
communication between all available cores to ensure that
the processing tasks are divided and assigned accurately.
THANK YOU!
31
Ad

More Related Content

What's hot (20)

multiprocessors and multicomputers
 multiprocessors and multicomputers multiprocessors and multicomputers
multiprocessors and multicomputers
Pankaj Kumar Jain
 
Parallel processing
Parallel processingParallel processing
Parallel processing
Praveen Kumar
 
Reader/writer problem
Reader/writer problemReader/writer problem
Reader/writer problem
RinkuMonani
 
Cluster Computing
Cluster ComputingCluster Computing
Cluster Computing
BOSS Webtech
 
Addressing in networking (IP,MAC,Port addressing)
Addressing in networking (IP,MAC,Port addressing)Addressing in networking (IP,MAC,Port addressing)
Addressing in networking (IP,MAC,Port addressing)
Geethu Jose
 
Parallel Programing Model
Parallel Programing ModelParallel Programing Model
Parallel Programing Model
Adlin Jeena
 
Mobile Network Layer
Mobile Network LayerMobile Network Layer
Mobile Network Layer
Rahul Hada
 
Computer architecture multi processor
Computer architecture multi processorComputer architecture multi processor
Computer architecture multi processor
Mazin Alwaaly
 
Amoeba distributed operating System
Amoeba distributed operating SystemAmoeba distributed operating System
Amoeba distributed operating System
Saurabh Gupta
 
program flow mechanisms, advanced computer architecture
program flow mechanisms, advanced computer architectureprogram flow mechanisms, advanced computer architecture
program flow mechanisms, advanced computer architecture
Pankaj Kumar Jain
 
Branch prediction
Branch predictionBranch prediction
Branch prediction
Aneesh Raveendran
 
Introduction to Parallel Computing
Introduction to Parallel ComputingIntroduction to Parallel Computing
Introduction to Parallel Computing
Akhila Prabhakaran
 
Parallel computing persentation
Parallel computing persentationParallel computing persentation
Parallel computing persentation
VIKAS SINGH BHADOURIA
 
Parallel computing
Parallel computingParallel computing
Parallel computing
Vinay Gupta
 
Communication model of parallel platforms
Communication model of parallel platformsCommunication model of parallel platforms
Communication model of parallel platforms
Syed Zaid Irshad
 
Parallel Algorithms
Parallel AlgorithmsParallel Algorithms
Parallel Algorithms
Dr Sandeep Kumar Poonia
 
Computer organisation -morris mano
Computer organisation  -morris manoComputer organisation  -morris mano
Computer organisation -morris mano
vishnu murthy
 
Inter process communication
Inter process communicationInter process communication
Inter process communication
Mohd Tousif
 
Shared-Memory Multiprocessors
Shared-Memory MultiprocessorsShared-Memory Multiprocessors
Shared-Memory Multiprocessors
Salvatore La Bua
 
Semophores and it's types
Semophores and it's typesSemophores and it's types
Semophores and it's types
Nishant Joshi
 
multiprocessors and multicomputers
 multiprocessors and multicomputers multiprocessors and multicomputers
multiprocessors and multicomputers
Pankaj Kumar Jain
 
Reader/writer problem
Reader/writer problemReader/writer problem
Reader/writer problem
RinkuMonani
 
Addressing in networking (IP,MAC,Port addressing)
Addressing in networking (IP,MAC,Port addressing)Addressing in networking (IP,MAC,Port addressing)
Addressing in networking (IP,MAC,Port addressing)
Geethu Jose
 
Parallel Programing Model
Parallel Programing ModelParallel Programing Model
Parallel Programing Model
Adlin Jeena
 
Mobile Network Layer
Mobile Network LayerMobile Network Layer
Mobile Network Layer
Rahul Hada
 
Computer architecture multi processor
Computer architecture multi processorComputer architecture multi processor
Computer architecture multi processor
Mazin Alwaaly
 
Amoeba distributed operating System
Amoeba distributed operating SystemAmoeba distributed operating System
Amoeba distributed operating System
Saurabh Gupta
 
program flow mechanisms, advanced computer architecture
program flow mechanisms, advanced computer architectureprogram flow mechanisms, advanced computer architecture
program flow mechanisms, advanced computer architecture
Pankaj Kumar Jain
 
Introduction to Parallel Computing
Introduction to Parallel ComputingIntroduction to Parallel Computing
Introduction to Parallel Computing
Akhila Prabhakaran
 
Parallel computing
Parallel computingParallel computing
Parallel computing
Vinay Gupta
 
Communication model of parallel platforms
Communication model of parallel platformsCommunication model of parallel platforms
Communication model of parallel platforms
Syed Zaid Irshad
 
Computer organisation -morris mano
Computer organisation  -morris manoComputer organisation  -morris mano
Computer organisation -morris mano
vishnu murthy
 
Inter process communication
Inter process communicationInter process communication
Inter process communication
Mohd Tousif
 
Shared-Memory Multiprocessors
Shared-Memory MultiprocessorsShared-Memory Multiprocessors
Shared-Memory Multiprocessors
Salvatore La Bua
 
Semophores and it's types
Semophores and it's typesSemophores and it's types
Semophores and it's types
Nishant Joshi
 

Similar to Parallel Processing Concepts (20)

5.7 Parallel Processing - Reactive Programming.pdf.pptx
5.7 Parallel Processing - Reactive Programming.pdf.pptx5.7 Parallel Processing - Reactive Programming.pdf.pptx
5.7 Parallel Processing - Reactive Programming.pdf.pptx
MohamedBilal73
 
Ca alternative architecture
Ca alternative architectureCa alternative architecture
Ca alternative architecture
University of Sargodha
 
Par com
Par comPar com
Par com
tttoracle
 
The Concurrency Challenge : Notes
The Concurrency Challenge : NotesThe Concurrency Challenge : Notes
The Concurrency Challenge : Notes
Subhajit Sahu
 
Introduction to parallel_computing
Introduction to parallel_computingIntroduction to parallel_computing
Introduction to parallel_computing
Mehul Patel
 
Chap 1(one) general introduction
Chap 1(one)  general introductionChap 1(one)  general introduction
Chap 1(one) general introduction
Malobe Lottin Cyrille Marcel
 
Concurrent Matrix Multiplication on Multi-core Processors
Concurrent Matrix Multiplication on Multi-core ProcessorsConcurrent Matrix Multiplication on Multi-core Processors
Concurrent Matrix Multiplication on Multi-core Processors
CSCJournals
 
distributed-systemsfghjjjijoijioj-chap3.pptx
distributed-systemsfghjjjijoijioj-chap3.pptxdistributed-systemsfghjjjijoijioj-chap3.pptx
distributed-systemsfghjjjijoijioj-chap3.pptx
lencho3d
 
Clustering by AKASHMSHAH
Clustering by AKASHMSHAHClustering by AKASHMSHAH
Clustering by AKASHMSHAH
Akash M Shah
 
Parallel Computing 2007: Overview
Parallel Computing 2007: OverviewParallel Computing 2007: Overview
Parallel Computing 2007: Overview
Geoffrey Fox
 
Future prediction-ds
Future prediction-dsFuture prediction-ds
Future prediction-ds
Muhammad Umar Farooq
 
CLUSTER COMPUTING
CLUSTER COMPUTINGCLUSTER COMPUTING
CLUSTER COMPUTING
KITE www.kitecolleges.com
 
Complier design
Complier design Complier design
Complier design
shreeuva
 
Distributed Computing
Distributed ComputingDistributed Computing
Distributed Computing
Sudarsun Santhiappan
 
High-Performance Computing and OpenSolaris
High-Performance Computing and OpenSolarisHigh-Performance Computing and OpenSolaris
High-Performance Computing and OpenSolaris
José Maria Silveira Neto
 
Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism) Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism)
Dr. A. B. Shinde
 
HOMOGENEOUS MULTISTAGE ARCHITECTURE FOR REAL-TIME IMAGE PROCESSING
HOMOGENEOUS MULTISTAGE ARCHITECTURE FOR REAL-TIME IMAGE PROCESSINGHOMOGENEOUS MULTISTAGE ARCHITECTURE FOR REAL-TIME IMAGE PROCESSING
HOMOGENEOUS MULTISTAGE ARCHITECTURE FOR REAL-TIME IMAGE PROCESSING
cscpconf
 
Integrating research and e learning in advance computer architecture
Integrating research and e learning in advance computer architectureIntegrating research and e learning in advance computer architecture
Integrating research and e learning in advance computer architecture
MairaAslam3
 
Linking Programming models between Grids, Web 2.0 and Multicore
Linking Programming models between Grids, Web 2.0 and Multicore Linking Programming models between Grids, Web 2.0 and Multicore
Linking Programming models between Grids, Web 2.0 and Multicore
Geoffrey Fox
 
Automatically partitioning packet processing applications for pipelined archi...
Automatically partitioning packet processing applications for pipelined archi...Automatically partitioning packet processing applications for pipelined archi...
Automatically partitioning packet processing applications for pipelined archi...
Ashley Carter
 
5.7 Parallel Processing - Reactive Programming.pdf.pptx
5.7 Parallel Processing - Reactive Programming.pdf.pptx5.7 Parallel Processing - Reactive Programming.pdf.pptx
5.7 Parallel Processing - Reactive Programming.pdf.pptx
MohamedBilal73
 
The Concurrency Challenge : Notes
The Concurrency Challenge : NotesThe Concurrency Challenge : Notes
The Concurrency Challenge : Notes
Subhajit Sahu
 
Introduction to parallel_computing
Introduction to parallel_computingIntroduction to parallel_computing
Introduction to parallel_computing
Mehul Patel
 
Concurrent Matrix Multiplication on Multi-core Processors
Concurrent Matrix Multiplication on Multi-core ProcessorsConcurrent Matrix Multiplication on Multi-core Processors
Concurrent Matrix Multiplication on Multi-core Processors
CSCJournals
 
distributed-systemsfghjjjijoijioj-chap3.pptx
distributed-systemsfghjjjijoijioj-chap3.pptxdistributed-systemsfghjjjijoijioj-chap3.pptx
distributed-systemsfghjjjijoijioj-chap3.pptx
lencho3d
 
Clustering by AKASHMSHAH
Clustering by AKASHMSHAHClustering by AKASHMSHAH
Clustering by AKASHMSHAH
Akash M Shah
 
Parallel Computing 2007: Overview
Parallel Computing 2007: OverviewParallel Computing 2007: Overview
Parallel Computing 2007: Overview
Geoffrey Fox
 
Complier design
Complier design Complier design
Complier design
shreeuva
 
Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism) Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism)
Dr. A. B. Shinde
 
HOMOGENEOUS MULTISTAGE ARCHITECTURE FOR REAL-TIME IMAGE PROCESSING
HOMOGENEOUS MULTISTAGE ARCHITECTURE FOR REAL-TIME IMAGE PROCESSINGHOMOGENEOUS MULTISTAGE ARCHITECTURE FOR REAL-TIME IMAGE PROCESSING
HOMOGENEOUS MULTISTAGE ARCHITECTURE FOR REAL-TIME IMAGE PROCESSING
cscpconf
 
Integrating research and e learning in advance computer architecture
Integrating research and e learning in advance computer architectureIntegrating research and e learning in advance computer architecture
Integrating research and e learning in advance computer architecture
MairaAslam3
 
Linking Programming models between Grids, Web 2.0 and Multicore
Linking Programming models between Grids, Web 2.0 and Multicore Linking Programming models between Grids, Web 2.0 and Multicore
Linking Programming models between Grids, Web 2.0 and Multicore
Geoffrey Fox
 
Automatically partitioning packet processing applications for pipelined archi...
Automatically partitioning packet processing applications for pipelined archi...Automatically partitioning packet processing applications for pipelined archi...
Automatically partitioning packet processing applications for pipelined archi...
Ashley Carter
 
Ad

More from Dr Shashikant Athawale (20)

multi threaded and distributed algorithms
multi threaded and distributed algorithms multi threaded and distributed algorithms
multi threaded and distributed algorithms
Dr Shashikant Athawale
 
Amortized analysis
Amortized analysisAmortized analysis
Amortized analysis
Dr Shashikant Athawale
 
Complexity theory
Complexity theory Complexity theory
Complexity theory
Dr Shashikant Athawale
 
Divide and Conquer
Divide and ConquerDivide and Conquer
Divide and Conquer
Dr Shashikant Athawale
 
Model and Design
Model and Design Model and Design
Model and Design
Dr Shashikant Athawale
 
Fundamental of Algorithms
Fundamental of Algorithms Fundamental of Algorithms
Fundamental of Algorithms
Dr Shashikant Athawale
 
CUDA Architecture
CUDA ArchitectureCUDA Architecture
CUDA Architecture
Dr Shashikant Athawale
 
Parallel Algorithms- Sorting and Graph
Parallel Algorithms- Sorting and GraphParallel Algorithms- Sorting and Graph
Parallel Algorithms- Sorting and Graph
Dr Shashikant Athawale
 
Analytical Models of Parallel Programs
Analytical Models of Parallel ProgramsAnalytical Models of Parallel Programs
Analytical Models of Parallel Programs
Dr Shashikant Athawale
 
Basic Communication
Basic CommunicationBasic Communication
Basic Communication
Dr Shashikant Athawale
 
Parallel Processing Concepts
Parallel Processing Concepts Parallel Processing Concepts
Parallel Processing Concepts
Dr Shashikant Athawale
 
Dynamic programming
Dynamic programmingDynamic programming
Dynamic programming
Dr Shashikant Athawale
 
Parallel algorithms
Parallel algorithms Parallel algorithms
Parallel algorithms
Dr Shashikant Athawale
 
Greedy method
Greedy method Greedy method
Greedy method
Dr Shashikant Athawale
 
Divide and conquer
Divide and conquerDivide and conquer
Divide and conquer
Dr Shashikant Athawale
 
Branch and bound
Branch and boundBranch and bound
Branch and bound
Dr Shashikant Athawale
 
Asymptotic notation
Asymptotic notationAsymptotic notation
Asymptotic notation
Dr Shashikant Athawale
 
String matching algorithms
String matching algorithmsString matching algorithms
String matching algorithms
Dr Shashikant Athawale
 
Advanced Wireless Technologies
Advanced Wireless TechnologiesAdvanced Wireless Technologies
Advanced Wireless Technologies
Dr Shashikant Athawale
 
Vo ip
Vo ipVo ip
Vo ip
Dr Shashikant Athawale
 
Ad

Recently uploaded (20)

theory-slides-for react for beginners.pptx
theory-slides-for react for beginners.pptxtheory-slides-for react for beginners.pptx
theory-slides-for react for beginners.pptx
sanchezvanessa7896
 
Mathematical foundation machine learning.pdf
Mathematical foundation machine learning.pdfMathematical foundation machine learning.pdf
Mathematical foundation machine learning.pdf
TalhaShahid49
 
Artificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptxArtificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptx
aditichinar
 
new ppt artificial intelligence historyyy
new ppt artificial intelligence historyyynew ppt artificial intelligence historyyy
new ppt artificial intelligence historyyy
PianoPianist
 
International Journal of Distributed and Parallel systems (IJDPS)
International Journal of Distributed and Parallel systems (IJDPS)International Journal of Distributed and Parallel systems (IJDPS)
International Journal of Distributed and Parallel systems (IJDPS)
samueljackson3773
 
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptxExplainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
MahaveerVPandit
 
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdffive-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
AdityaSharma944496
 
Process Parameter Optimization for Minimizing Springback in Cold Drawing Proc...
Process Parameter Optimization for Minimizing Springback in Cold Drawing Proc...Process Parameter Optimization for Minimizing Springback in Cold Drawing Proc...
Process Parameter Optimization for Minimizing Springback in Cold Drawing Proc...
Journal of Soft Computing in Civil Engineering
 
MAQUINARIA MINAS CEMA 6th Edition (1).pdf
MAQUINARIA MINAS CEMA 6th Edition (1).pdfMAQUINARIA MINAS CEMA 6th Edition (1).pdf
MAQUINARIA MINAS CEMA 6th Edition (1).pdf
ssuser562df4
 
DT REPORT by Tech titan GROUP to introduce the subject design Thinking
DT REPORT by Tech titan GROUP to introduce the subject design ThinkingDT REPORT by Tech titan GROUP to introduce the subject design Thinking
DT REPORT by Tech titan GROUP to introduce the subject design Thinking
DhruvChotaliya2
 
Machine learning project on employee attrition detection using (2).pptx
Machine learning project on employee attrition detection using (2).pptxMachine learning project on employee attrition detection using (2).pptx
Machine learning project on employee attrition detection using (2).pptx
rajeswari89780
 
Data Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptxData Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptx
RushaliDeshmukh2
 
AI-assisted Software Testing (3-hours tutorial)
AI-assisted Software Testing (3-hours tutorial)AI-assisted Software Testing (3-hours tutorial)
AI-assisted Software Testing (3-hours tutorial)
Vəhid Gəruslu
 
π0.5: a Vision-Language-Action Model with Open-World Generalization
π0.5: a Vision-Language-Action Model with Open-World Generalizationπ0.5: a Vision-Language-Action Model with Open-World Generalization
π0.5: a Vision-Language-Action Model with Open-World Generalization
NABLAS株式会社
 
Compiler Design Unit1 PPT Phases of Compiler.pptx
Compiler Design Unit1 PPT Phases of Compiler.pptxCompiler Design Unit1 PPT Phases of Compiler.pptx
Compiler Design Unit1 PPT Phases of Compiler.pptx
RushaliDeshmukh2
 
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
inmishra17121973
 
ADVXAI IN MALWARE ANALYSIS FRAMEWORK: BALANCING EXPLAINABILITY WITH SECURITY
ADVXAI IN MALWARE ANALYSIS FRAMEWORK: BALANCING EXPLAINABILITY WITH SECURITYADVXAI IN MALWARE ANALYSIS FRAMEWORK: BALANCING EXPLAINABILITY WITH SECURITY
ADVXAI IN MALWARE ANALYSIS FRAMEWORK: BALANCING EXPLAINABILITY WITH SECURITY
ijscai
 
Degree_of_Automation.pdf for Instrumentation and industrial specialist
Degree_of_Automation.pdf for  Instrumentation  and industrial specialistDegree_of_Automation.pdf for  Instrumentation  and industrial specialist
Degree_of_Automation.pdf for Instrumentation and industrial specialist
shreyabhosale19
 
Development of MLR, ANN and ANFIS Models for Estimation of PCUs at Different ...
Development of MLR, ANN and ANFIS Models for Estimation of PCUs at Different ...Development of MLR, ANN and ANFIS Models for Estimation of PCUs at Different ...
Development of MLR, ANN and ANFIS Models for Estimation of PCUs at Different ...
Journal of Soft Computing in Civil Engineering
 
DSP and MV the Color image processing.ppt
DSP and MV the  Color image processing.pptDSP and MV the  Color image processing.ppt
DSP and MV the Color image processing.ppt
HafizAhamed8
 
theory-slides-for react for beginners.pptx
theory-slides-for react for beginners.pptxtheory-slides-for react for beginners.pptx
theory-slides-for react for beginners.pptx
sanchezvanessa7896
 
Mathematical foundation machine learning.pdf
Mathematical foundation machine learning.pdfMathematical foundation machine learning.pdf
Mathematical foundation machine learning.pdf
TalhaShahid49
 
Artificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptxArtificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptx
aditichinar
 
new ppt artificial intelligence historyyy
new ppt artificial intelligence historyyynew ppt artificial intelligence historyyy
new ppt artificial intelligence historyyy
PianoPianist
 
International Journal of Distributed and Parallel systems (IJDPS)
International Journal of Distributed and Parallel systems (IJDPS)International Journal of Distributed and Parallel systems (IJDPS)
International Journal of Distributed and Parallel systems (IJDPS)
samueljackson3773
 
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptxExplainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
MahaveerVPandit
 
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdffive-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
AdityaSharma944496
 
MAQUINARIA MINAS CEMA 6th Edition (1).pdf
MAQUINARIA MINAS CEMA 6th Edition (1).pdfMAQUINARIA MINAS CEMA 6th Edition (1).pdf
MAQUINARIA MINAS CEMA 6th Edition (1).pdf
ssuser562df4
 
DT REPORT by Tech titan GROUP to introduce the subject design Thinking
DT REPORT by Tech titan GROUP to introduce the subject design ThinkingDT REPORT by Tech titan GROUP to introduce the subject design Thinking
DT REPORT by Tech titan GROUP to introduce the subject design Thinking
DhruvChotaliya2
 
Machine learning project on employee attrition detection using (2).pptx
Machine learning project on employee attrition detection using (2).pptxMachine learning project on employee attrition detection using (2).pptx
Machine learning project on employee attrition detection using (2).pptx
rajeswari89780
 
Data Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptxData Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptx
RushaliDeshmukh2
 
AI-assisted Software Testing (3-hours tutorial)
AI-assisted Software Testing (3-hours tutorial)AI-assisted Software Testing (3-hours tutorial)
AI-assisted Software Testing (3-hours tutorial)
Vəhid Gəruslu
 
π0.5: a Vision-Language-Action Model with Open-World Generalization
π0.5: a Vision-Language-Action Model with Open-World Generalizationπ0.5: a Vision-Language-Action Model with Open-World Generalization
π0.5: a Vision-Language-Action Model with Open-World Generalization
NABLAS株式会社
 
Compiler Design Unit1 PPT Phases of Compiler.pptx
Compiler Design Unit1 PPT Phases of Compiler.pptxCompiler Design Unit1 PPT Phases of Compiler.pptx
Compiler Design Unit1 PPT Phases of Compiler.pptx
RushaliDeshmukh2
 
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
inmishra17121973
 
ADVXAI IN MALWARE ANALYSIS FRAMEWORK: BALANCING EXPLAINABILITY WITH SECURITY
ADVXAI IN MALWARE ANALYSIS FRAMEWORK: BALANCING EXPLAINABILITY WITH SECURITYADVXAI IN MALWARE ANALYSIS FRAMEWORK: BALANCING EXPLAINABILITY WITH SECURITY
ADVXAI IN MALWARE ANALYSIS FRAMEWORK: BALANCING EXPLAINABILITY WITH SECURITY
ijscai
 
Degree_of_Automation.pdf for Instrumentation and industrial specialist
Degree_of_Automation.pdf for  Instrumentation  and industrial specialistDegree_of_Automation.pdf for  Instrumentation  and industrial specialist
Degree_of_Automation.pdf for Instrumentation and industrial specialist
shreyabhosale19
 
DSP and MV the Color image processing.ppt
DSP and MV the  Color image processing.pptDSP and MV the  Color image processing.ppt
DSP and MV the Color image processing.ppt
HafizAhamed8
 

Parallel Processing Concepts

  • 1. PARALLEL PROCESSING CONCEPTS Prof. Shashikant V. Athawale Assistant Professor | Computer Engineering Department | AISSMS College of Engineering, Kennedy Road, Pune , MH, India - 411001
  • 2. Contents 2  Introduction to Parallel Computing  Motivating Parallelism  Scope of Parallel Computing  Parallel Programming Platforms  Implicit Parallelism  Trends in Microprocessor and Architectures  Limitations of Memory System Performance  Dichotomy of Parallel Computing Platforms  Physical Organization of Parallel Platforms  Communication Costs in Parallel Machines  Scalable design principles  Architectures: N-wide superscalar architectures  Multi-core architectures.
  • 3. Introduction to Parallel Computing: A parallel computer is a “Collection of processing elements that communicate and co-operate to solve large problems fast”. Processing multiple tasks simultaneously on multiple processors is called parallel processing.
  • 4. What is Parallel Computing? Traditionally, software has been written for serial computation: to be run on a single computer having a single Central Processing Unit (CPU).
  • 5. What is Parallel Computing? In the simplest sense, parallel computing is the simultaneous use of multiple compute resources to solve a computational problem.
  • 6. Serial vs. Parallel Computing. [Figure; labels: Fetch/Store, Compute, Communicate, “Cooperative game”]
  • 7. Motivating Parallelism: The role of parallelism in accelerating computing speeds has been recognized for several decades. Its role in providing multiplicity of datapaths and increased access to storage elements has been significant in commercial applications. The scalable performance and lower cost of parallel platforms are reflected in the wide variety of applications.
  • 8. Developing parallel hardware and software has traditionally been time and effort intensive. If one views this in the context of rapidly improving uniprocessor speeds, one is tempted to question the need for parallel computing. However, uniprocessor architectures may not sustain that rate of improvement, owing to a number of fundamental physical and computational limitations. The emergence of standardized parallel programming environments, libraries, and hardware has significantly reduced time to (parallel) solution.
  • 9. In short: 1. Overcome limits to serial computing 2. Limits to increasing transistor density 3. Limits to data transmission speed 4. Faster turnaround time 5. Solve larger problems
  • 10. Scope of Parallel Computing: Parallel computing has a great impact on a wide range of applications, both commercial and scientific. Turnaround time should be minimal; high performance; resource management; load balancing; dynamic library; minimum network congestion and latency.
  • 11. Applications: Commercial computing (weather forecasting; remote sensors, image processing; process optimization, operations research). Scientific and engineering applications (computational chemistry; molecular modelling; structure mechanics). Business applications (e-governance; medical imaging). Internet applications (Internet servers; digital libraries).
  • 12. Parallel Programming Platforms: The main objective is to give the programmer sufficient detail to write efficient code on a variety of platforms; performance of various parallel algorithms.
  • 13. Implicit Parallelism: A programming language is said to be implicitly parallel if its compiler or interpreter can recognize opportunities for parallelization and implement them without being told to do so.
  • 14. Implicitly parallel programming languages: Microsoft Axum; MATLAB's M-code; ZPL; Laboratory Virtual Instrument Engineering Workbench (LabVIEW); NESL; SISAL; High-Performance Fortran (HPF).
  • 15. Dichotomy of Parallel Computing Platforms: First explore a dichotomy based on the logical and physical organization of parallel platforms. The logical organization refers to a programmer's view of the platform, while the physical organization refers to the actual hardware organization of the platform. The two critical components of parallel computing from a programmer's perspective are ways of expressing parallel tasks and mechanisms for specifying interaction between these tasks. The former is sometimes also referred to as the control structure and the latter as the communication model.
  • 16. Control Structure of Parallel Platforms: Parallel tasks can be specified at various levels of granularity. At one extreme, each program in a set of programs can be viewed as one parallel task. At the other extreme, individual instructions within a program can be viewed as parallel tasks. Between these extremes lies a range of models for specifying the control structure of programs and the corresponding architectural support for them. Parallelism from a single instruction on multiple processors: consider the following code segment that adds two vectors: for (i = 0; i < 1000; i++) c[i] = a[i] + b[i]; In this example, the iterations of the loop are independent of each other; i.e., c[0] = a[0] + b[0]; c[1] = a[1] + b[1]; etc., can all be executed independently of each other. Consequently, if there is a mechanism for executing the same instruction, in this case add, on all the processors with appropriate data, we could execute this loop much faster.
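The independence of the loop iterations on slide 16 can be sketched in C as follows. This is a minimal illustration, not the deck's own code: the helper names `add_chunk` and `vector_add`, and the parameters `rank` and `p` (index of a processing element and the number of elements), are assumptions introduced here. The sketch runs the chunks one after another, standing in for p elements that would execute them at the same time.

```c
#include <assert.h>
#include <stddef.h>

/* Adds the slice of a and b owned by processing element `rank` (0..p-1)
   into c. Hypothetical helper: rank, p and add_chunk are illustrative
   names that do not come from the slides. */
void add_chunk(const double *a, const double *b, double *c,
               size_t n, size_t rank, size_t p)
{
    assert(p > 0 && rank < p);
    size_t begin = rank * n / p;        /* first index owned by this element */
    size_t end = (rank + 1) * n / p;    /* one past the last owned index */
    for (size_t i = begin; i < end; i++)
        c[i] = a[i] + b[i];             /* same instruction, different data */
}

/* Serial stand-in for p elements working at once: because the iterations
   are independent, the chunks could run in any order or simultaneously
   and produce the same result. */
void vector_add(const double *a, const double *b, double *c,
                size_t n, size_t p)
{
    for (size_t rank = 0; rank < p; rank++)
        add_chunk(a, b, c, n, rank, p);
}
```

Calling `vector_add(a, b, c, 1000, 4)` gives the same `c` as the original loop; on a real parallel platform each `add_chunk` call would be handed to a different processor.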
  • 17. Figure: A typical SIMD architecture (a) and a typical MIMD architecture (b).
  • 18. Executing a conditional statement on an SIMD computer with four processors: (a) the conditional statement; (b) the execution of the statement in two steps.
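The two-step execution described on slide 18 can be emulated in plain C. The slide's figure is not reproduced in the transcript, so the statement used here, `if (B == 0) C = A; else C = A / B;`, is an assumed example, and `simd_conditional` and `NPROC` are hypothetical names. Each array index plays the role of one processing element; a mask decides which elements are active in each step while the others idle.

```c
#include <assert.h>
#include <stddef.h>

#define NPROC 4  /* four processing elements, as in the slide's caption */

/* Emulates `if (B == 0) C = A; else C = A / B;` across NPROC elements.
   The statement is an assumed example, not taken from the transcript. */
void simd_conditional(const int A[NPROC], const int B[NPROC], int C[NPROC])
{
    assert(A != NULL && B != NULL && C != NULL);

    int mask[NPROC];
    for (size_t i = 0; i < NPROC; i++)
        mask[i] = (B[i] == 0);          /* every element evaluates the test */

    /* Step 1: elements whose test succeeded run the then-branch;
       the remaining elements are idle. */
    for (size_t i = 0; i < NPROC; i++)
        if (mask[i])
            C[i] = A[i];

    /* Step 2: the roles swap and the other elements run the else-branch. */
    for (size_t i = 0; i < NPROC; i++)
        if (!mask[i])
            C[i] = A[i] / B[i];
}
```

Because each step leaves part of the machine idle, a conditional costs an SIMD computer two passes even though every element executes only one branch.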
  • 19. Communication Model of Parallel Platforms: Shared-Address-Space Platforms. Typical shared-address-space architectures: (a) Uniform-memory-access shared-address-space computer; (b) Uniform-memory-access shared-address-space computer with caches and memories; (c) Non-uniform-memory-access shared-address-space computer with local memory only.
  • 20. Message-Passing Platforms: The logical machine view of a message-passing platform consists of p processing nodes. Examples include clustered workstations and non-shared-address-space multicomputers. On such platforms, interactions between processes running on different nodes must be accomplished using messages, hence the name message passing. This exchange of messages is used to transfer data and work, and to synchronize actions among the processes. In its most general form, the message-passing paradigm supports execution of a different program on each of the p nodes.
  • 21. Physical Organization of Parallel Platforms: Architecture of an Ideal Parallel Computer. Exclusive-read, exclusive-write (EREW) PRAM: access to a memory location is exclusive; no concurrent read or write operations are allowed. Concurrent-read, exclusive-write (CREW) PRAM: multiple read accesses to a memory location are allowed. Exclusive-read, concurrent-write (ERCW) PRAM: multiple write accesses are allowed to a memory location, but multiple read accesses are serialized. Concurrent-read, concurrent-write (CRCW) PRAM: multiple read and write accesses to a common memory location are allowed. This is the most powerful PRAM model.
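The four PRAM classes on slide 21 differ only in whether several processors may read, or write, the same location in one step. A minimal C sketch of that rule follows. The type `pram_model` and the functions `has_conflict` and `step_allowed` are hypothetical names introduced for illustration, and the sketch deliberately ignores how a concurrent write would be resolved and whether a read and a write touch the same address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { EREW, CREW, ERCW, CRCW } pram_model;

/* Returns true if any two of the n addresses coincide. */
static bool has_conflict(const size_t *addr, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (addr[i] == addr[j])
                return true;
    return false;
}

/* Returns true if all the reads and writes issued in one step can proceed
   concurrently under model m. reads[i] / writes[i] is the memory address
   accessed by the i-th reading / writing processor (illustrative layout). */
bool step_allowed(pram_model m, const size_t *reads, size_t nreads,
                  const size_t *writes, size_t nwrites)
{
    assert(reads != NULL || nreads == 0);
    assert(writes != NULL || nwrites == 0);

    bool concurrent_read_ok = (m == CREW || m == CRCW);
    bool concurrent_write_ok = (m == ERCW || m == CRCW);

    if (!concurrent_read_ok && has_conflict(reads, nreads))
        return false;   /* two processors read the same location */
    if (!concurrent_write_ok && has_conflict(writes, nwrites))
        return false;   /* two processors write the same location */
    return true;
}
```

For example, two processors reading address 3 in the same step pass the check under CREW and CRCW but fail it under EREW and ERCW.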
  • 22. Interconnection Networks for Parallel Computers: Interconnection networks can be classified as static or dynamic. Static networks consist of point-to-point communication links among processing nodes and are also referred to as direct networks. Figure: Classification of interconnection networks: (a) a static network; and (b) a dynamic network.
  • 23. Network Topology: Linear Arrays. Figure: Linear arrays: (a) with no wraparound links; (b) with wraparound link.
  • 24. Figure: Two- and three-dimensional meshes: (a) 2-D mesh with no wraparound; (b) 2-D mesh with wraparound link (2-D torus); and (c) a 3-D mesh with no wraparound.
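The difference between a 2-D mesh with and without wraparound (slide 24) shows up in how a node's neighbours are computed. The C sketch below is an illustration under stated assumptions: nodes are addressed by (row, column), and the function name `mesh_neighbours` and its parameters are hypothetical. With `wrap` set, the mesh becomes a 2-D torus; the sketch assumes at least three rows and three columns in that case so that the four neighbours are distinct.

```c
#include <assert.h>
#include <stdbool.h>

/* Stores the neighbours of node (r, c) of a rows x cols mesh in out[k][0]
   (row) and out[k][1] (column) and returns how many there are.
   wrap == false: plain 2-D mesh, edge and corner nodes have fewer links.
   wrap == true:  2-D torus, every node has four links. */
int mesh_neighbours(int r, int c, int rows, int cols, bool wrap,
                    int out[4][2])
{
    static const int dr[4] = {-1, 1, 0, 0};   /* up, down, left, right */
    static const int dc[4] = {0, 0, -1, 1};

    assert(rows > 0 && cols > 0);
    assert(r >= 0 && r < rows && c >= 0 && c < cols);

    int count = 0;
    for (int k = 0; k < 4; k++) {
        int nr = r + dr[k];
        int nc = c + dc[k];
        if (wrap) {
            nr = (nr + rows) % rows;          /* wraparound link */
            nc = (nc + cols) % cols;
        } else if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) {
            continue;                         /* no link past the edge */
        }
        out[count][0] = nr;
        out[count][1] = nc;
        count++;
    }
    return count;
}
```

In a 4 x 4 mesh the corner node (0, 0) has two neighbours without wraparound and four with it, which is the distinction between panels (a) and (b) of the figure.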
  • 25. Figure: Construction of hypercubes from hypercubes of lower dimension.
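The construction on slide 25 has a compact description if nodes are given binary labels: a d-dimensional hypercube joins two (d-1)-dimensional hypercubes by linking nodes whose labels agree in every bit except the new leading one, so two nodes are neighbours exactly when their labels differ in one bit. The C sketch below assumes that standard labelling; the function names are hypothetical.

```c
#include <assert.h>

/* Writes the d neighbours of `node` in a d-dimensional hypercube into
   out[0..d-1]. Flipping bit k gives the neighbour along dimension k. */
void hypercube_neighbours(unsigned node, unsigned d, unsigned *out)
{
    assert(d < 32 && node < (1u << d));
    for (unsigned k = 0; k < d; k++)
        out[k] = node ^ (1u << k);
}

/* Minimum number of links between nodes a and b: the number of bit
   positions in which their labels differ (the Hamming distance). */
unsigned hypercube_distance(unsigned a, unsigned b)
{
    unsigned diff = a ^ b;
    unsigned hops = 0;
    while (diff != 0) {
        hops += diff & 1u;
        diff >>= 1;
    }
    return hops;
}
```

In a 3-D hypercube, node 5 (binary 101) is linked to nodes 4, 7 and 1, and the farthest pair of nodes, such as 0 and 7, is three links apart.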
  • 26. Tree-Based Networks. Figure: Complete binary tree networks: (a) a static tree network; and (b) a dynamic tree network.
  • 27. Scalable design principles: ❖ Avoid a single point of failure. ❖ Scale horizontally, not vertically. ❖ Push work as far away from the core as possible. ❖ API first. ❖ Cache everything, always. ❖ Provide data only as fresh as needed. ❖ Design for maintenance and automation. ❖ Asynchronous rather than synchronous. ❖ Strive for statelessness.
  • 28. N-wide superscalar architecture: ❖ A superscalar architecture is called N-wide if it can fetch and dispatch N instructions in every cycle.
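One consequence of the definition on slide 28 is a simple lower bound on issue time: a machine that dispatches at most N instructions per cycle needs at least the instruction count divided by N, rounded up, cycles. The C sketch below computes that bound. The function name `min_issue_cycles` is hypothetical, and the bound assumes every cycle can be filled; data dependencies, branches and memory stalls make real programs take longer.

```c
#include <assert.h>

/* Best-case number of cycles an N-wide superscalar processor needs to
   dispatch `instructions` instructions, i.e. ceil(instructions / width).
   Assumes no dependencies or stalls, so this is a lower bound only. */
unsigned long min_issue_cycles(unsigned long instructions, unsigned width)
{
    assert(width > 0);
    return (instructions + width - 1) / width;
}
```

For instance, ten instructions on a 4-wide machine need at least three cycles, against ten on a machine that issues one instruction per cycle.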
  • 30. Multi-core architectures: ❖ Many cores fit on a single processor socket. ❖ Also called a chip multiprocessor. ❖ The cores run in parallel. ❖ The architecture of a multicore processor enables communication between all available cores to ensure that processing tasks are divided and assigned accurately.