Unit 4
Large problems can often be divided into smaller ones, which can then be solved at
the same time.
Serial Computation
Traditionally, software has been written for serial computation:
To be run on a single computer having a single Central Processing Unit (CPU)
What is Parallel Computing?
In the simplest sense, parallel computing is the simultaneous use of
multiple compute resources to solve a computational problem.
Advantages of Parallel Computing
Types of Parallelism
▪ Parallelism in Hardware
▪ Parallelism in a Uniprocessor
– Pipelining
– Superscalar
▪ SIMD instructions, Vector processors, GPUs
▪ Multiprocessor
– Symmetric shared-memory multiprocessors
– Distributed-memory multiprocessors
– Chip-multiprocessors a.k.a. Multi-cores
▪ Multicomputers a.k.a. clusters
▪ Parallelism in Software
▪ Instruction level parallelism
▪ Thread or Task-level parallelism
▪ Data parallelism
▪ Bit level parallelism
Taxonomy of Parallel Computers
▪ According to instruction and data streams (Flynn):
– Single instruction single data (SISD): this is the standard uniprocessor
– Single instruction multiple data (SIMD): one instruction stream operates on many data items at once (vector and array processors, GPUs)
– Multiple instruction single data (MISD): multiple instruction streams on a single data stream; rarely used in practice
– Multiple instruction multiple data (MIMD): independent instruction streams operate on independent data (multiprocessors and multicomputers)
Taxonomy of Parallel Computers
▪ According to memory communication model
– Shared address or shared memory: it emphasizes control parallelism more than data parallelism. In the shared-memory model, multiple processes execute on different processors independently, but they share a common memory space. If any processor changes any memory location, the change is visible to the rest of the processors.
▪ Processes in different processors can use the same virtual address space
▪ Any processor can directly access memory in another processor node
▪ Communication is done through shared memory variables
▪ Explicit synchronization with locks and critical sections
▪ Arguably easier to program??
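Below is a minimal sketch of this model using POSIX threads (an assumed choice of API; the shared counter and the mutex are hypothetical names used only for illustration). Both threads see the same counter variable, and the lock marks the critical section so no update is lost.

/* Minimal sketch of the shared-memory model with POSIX threads. */
#include <pthread.h>
#include <stdio.h>

static long counter = 0;                         /* shared memory variable */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);               /* enter critical section */
        counter++;                               /* update shared variable */
        pthread_mutex_unlock(&lock);             /* leave critical section */
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld\n", counter);          /* prints 200000 */
    return 0;
}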
Taxonomy of Parallel Computers
– Physically centralized memory, uniform memory access (UMA)
▪ All memory is located at the same distance from all processors
▪ Also called symmetric multiprocessors (SMP)
▪ Memory bandwidth is fixed and must be shared by all processors, so it does not
scale to a large number of processors
▪ Used in CMPs today (single-socket ones)
(Figure: UMA organization, with all processors connected through an interconnection to a single main memory.)
Taxonomy of Parallel Computers
– Physically distributed memory, non-uniform memory access (NUMA)
▪ A portion of memory is allocated with each processor (node)
▪ Accessing local memory is much faster than remote memory
▪ If most accesses are to local memory, then the overall memory bandwidth increases
linearly with the number of processors
▪ Used in multi-socket CMPs, e.g., Intel Nehalem
(Figure: NUMA organization of a two-socket system such as Intel Nehalem: each socket has its own DDR3 memory channels and I/O, and the sockets communicate over an interconnection.)
Types of Parallelism in Applications
▪ Instruction-level parallelism (ILP)
– Multiple instructions from the same instruction stream can be executed
concurrently
– Generated and managed by hardware (superscalar) or by compiler (VLIW)
– Limited in practice by data and control dependences
– Ex. Pipelining, superscalar architecture, Very Long Instruction Word
(VLIW), etc.
Example
A = B + C // Instruction I1
D = E - F // Instruction I2
G = A * D // Instruction I3
Here I1 and I2 are independent and can execute concurrently, while I3 depends on the results of both and must wait for them to complete.
Types of Parallelism in Applications
▪ Data-level parallelism (DLP)
– Instructions from a single stream operate concurrently on several data
– Limited by non-regular data manipulation patterns and by memory
bandwidth.
– Ex. SIMD: Vector processing, array processing
– Let’s take an example: summing the contents of an array of size N.
For a single-core system, one thread would simply sum the elements
[0] . . . [N − 1]. For a dual-core system, however, thread A, running
on core 0, could sum the elements [0] . . . [N/2 − 1] while thread
B, running on core 1, sums the elements [N/2] . . . [N − 1]. The
two threads would then run in parallel on separate computing
cores, as sketched below.
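A minimal sketch of this dual-core sum, assuming POSIX threads; the array size N, the range struct, and the partial_sum helper are illustrative names chosen for this example, not part of any standard API.

/* Sketch: thread A sums the first half of the array, thread B the second. */
#include <pthread.h>
#include <stdio.h>

#define N 1000

static int data[N];

struct range { int lo, hi; long sum; };           /* [lo, hi) and its partial sum */

static void *partial_sum(void *arg)
{
    struct range *r = arg;
    r->sum = 0;
    for (int i = r->lo; i < r->hi; i++)
        r->sum += data[i];
    return NULL;
}

int main(void)
{
    for (int i = 0; i < N; i++) data[i] = 1;      /* sample data */

    struct range a = { 0, N / 2, 0 };             /* thread A: elements [0 .. N/2-1] */
    struct range b = { N / 2, N, 0 };             /* thread B: elements [N/2 .. N-1] */
    pthread_t ta, tb;
    pthread_create(&ta, NULL, partial_sum, &a);
    pthread_create(&tb, NULL, partial_sum, &b);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    printf("total = %ld\n", a.sum + b.sum);       /* combine the partial sums */
    return 0;
}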
Types of Parallelism in Applications
▪ Thread-level or task-level parallelism (TLP)
– Multiple threads or instruction sequences from the same application
can be executed concurrently
– Generated by compiler/user and managed by compiler and hardware
– Limited in practice by communication/synchronization overheads
and by algorithm characteristics.
– For example, task parallelism might involve two threads, each
performing a different statistical operation on the same array of
elements. The threads again operate in parallel on separate
computing cores, but each performs a different operation, as sketched below.
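A minimal sketch, again assuming POSIX threads; choosing the sum and the maximum as the two statistical operations is purely an assumption for illustration.

/* Sketch: two threads run different operations on the same array at the same time. */
#include <pthread.h>
#include <stdio.h>

#define N 1000

static int data[N];
static long total;                                 /* result of task 1 */
static int maximum;                                /* result of task 2 */

static void *compute_sum(void *arg)
{
    long s = 0;
    for (int i = 0; i < N; i++) s += data[i];
    total = s;
    return NULL;
}

static void *compute_max(void *arg)
{
    int m = data[0];
    for (int i = 1; i < N; i++) if (data[i] > m) m = data[i];
    maximum = m;
    return NULL;
}

int main(void)
{
    for (int i = 0; i < N; i++) data[i] = i % 7;   /* sample data */

    pthread_t t1, t2;
    pthread_create(&t1, NULL, compute_sum, NULL);  /* task 1: sum     */
    pthread_create(&t2, NULL, compute_max, NULL);  /* task 2: maximum */
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("sum = %ld, max = %d\n", total, maximum);
    return 0;
}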
Types of Parallelism in Applications
▪ Bit-level parallelism
– Bit-level parallelism is a form of parallel computing based on increasing
the processor word size, enabled by very-large-scale integration (VLSI)
technology. Historically, much of the improvement in computer design
came from increasing bit-level parallelism.
– For example, consider a case where an 8-bit processor must add two 16-bit
integers. The processor must first add the 8 lower-order bits of each
integer, then add the 8 higher-order bits together with the carry, requiring
two instructions to complete a single operation. A 16-bit processor can
complete the operation with a single instruction, as the sketch below illustrates.
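A small C sketch of this idea; add16_on_8bit is a hypothetical helper that mimics the two-step, byte-at-a-time addition an 8-bit processor would perform, while a 16-bit-wide addition handles the same operation in one step.

#include <stdint.h>
#include <stdio.h>

/* Mimics an 8-bit processor: add the low bytes, propagate the carry,
 * then add the high bytes - two steps for one logical 16-bit addition. */
static uint16_t add16_on_8bit(uint16_t x, uint16_t y)
{
    uint8_t lo    = (uint8_t)(x & 0xFF) + (uint8_t)(y & 0xFF);     /* step 1: low bytes   */
    uint8_t carry = lo < (uint8_t)(x & 0xFF);                      /* carry out of step 1 */
    uint8_t hi    = (uint8_t)(x >> 8) + (uint8_t)(y >> 8) + carry; /* step 2: high bytes  */
    return ((uint16_t)hi << 8) | lo;
}

int main(void)
{
    uint16_t x = 0x1234, y = 0x0FCD;
    uint16_t narrow = add16_on_8bit(x, y);    /* two 8-bit additions  */
    uint16_t wide   = (uint16_t)(x + y);      /* one 16-bit addition  */
    printf("0x%04X 0x%04X\n", narrow, wide);  /* both print 0x2201    */
    return 0;
}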