There is a continual demand for greater computational speed to solve problems like modeling large DNA structures, global weather forecasting, and modeling astronomical bodies. These "grand challenge problems" cannot be solved in a reasonable time using today's computers. Parallel computing, which uses more than one processor simultaneously, is often used to solve such problems faster. The speedup from parallel computing is measured as the execution time on one processor divided by the execution time on multiple processors. According to Amdahl's law, the maximum speedup is limited by the portion of a problem that cannot be parallelized. Superlinear speedup is possible in some cases like parallel searching, where the solution may be found faster.

slides1-1

Chapter 1

Parallel Computers

Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M. Allen, 2004 Pearson Education Inc. All rights reserved.

slides1-2

Demand for Computational Speed


Continual demand for greater computational speed from a computer system than is currently possible.

Areas requiring great computational speed include numerical modeling and simulation of scientific and engineering problems.

Computations must be completed within a reasonable time period.


slides1-3

Grand Challenge Problems


A grand challenge problem is one that cannot be solved in a reasonable amount of time with today's computers.

Obviously, an execution time of 10 years is always unreasonable.

Examples

Modeling large DNA structures

Global weather forecasting

Modeling motion of astronomical bodies.


slides1-4

Weather Forecasting
Atmosphere modeled by dividing it into 3-dimensional cells.
Calculations of each cell repeated many times to model the passage of time.


slides1-5

Global Weather Forecasting Example


Whole global atmosphere divided into cells of size 1 mile × 1 mile × 1 mile to a height of 10 miles (10 cells high) - about 5 × 10^8 cells.

Suppose each calculation requires 200 floating point operations. In one time step, 10^11 floating point operations are necessary.

To forecast the weather over 7 days using 1-minute intervals, a computer operating at 1 Gflops (10^9 floating point operations/s) would take 10^6 seconds, or over 10 days.

To perform the calculation in 5 minutes would require a computer operating at 3.4 Tflops (3.4 × 10^12 floating point operations/s).
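The arithmetic above can be checked with a short C sketch (the constants are taken from the figures quoted on this slide; the program itself is only illustrative):

#include <stdio.h>

int main(void) {
    double cells = 5e8;              /* ~5 x 10^8 cells (1 mi^3 cells to 10 miles height) */
    double ops_per_cell = 200.0;     /* floating point operations per cell per time step  */
    double steps = 7.0 * 24 * 60;    /* 7 days of 1-minute time steps                     */

    double ops_per_step = cells * ops_per_cell;   /* ~10^11 operations per time step */
    double total_ops = ops_per_step * steps;      /* whole 7-day forecast            */

    printf("per step: %.2e ops, total: %.2e ops\n", ops_per_step, total_ops);
    printf("time at 1 Gflops: %.2e s (%.1f days)\n",
           total_ops / 1e9, total_ops / 1e9 / 86400.0);
    printf("rate needed for a 5-minute run: %.2e flops\n", total_ops / (5 * 60.0));
    return 0;
}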

slides1-6

Modeling Motion of Astronomical Bodies


Each body attracted to each other body by gravitational forces. Movement of each body predicted by calculating the total force on each body. With N bodies, N − 1 forces to calculate for each body, or approx. N^2 calculations. (N log2 N for an efficient approximate algorithm.) After determining the new positions of the bodies, the calculations are repeated.

A galaxy might have, say, 10^11 stars. Even if each calculation could be done in 1 µs (an extremely optimistic figure), it would take 10^9 years for one iteration using the N^2 algorithm and almost a year for one iteration using an efficient N log2 N approximate algorithm.
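A minimal C sketch (not from the slides; the Body structure and constant handling are illustrative) of the direct N^2 force calculation described above:

#include <math.h>

#define G 6.674e-11   /* gravitational constant */

typedef struct { double x, y, z, mass; double fx, fy, fz; } Body;

/* Accumulate the gravitational force on every body from every other body:
   N - 1 forces per body, roughly N^2 calculations in total per time step. */
void compute_forces(Body *b, int n) {
    for (int i = 0; i < n; i++) {
        b[i].fx = b[i].fy = b[i].fz = 0.0;
        for (int j = 0; j < n; j++) {
            if (j == i) continue;
            double dx = b[j].x - b[i].x;
            double dy = b[j].y - b[i].y;
            double dz = b[j].z - b[i].z;
            double r2 = dx*dx + dy*dy + dz*dz;
            double r  = sqrt(r2);
            double f  = G * b[i].mass * b[j].mass / r2;  /* force magnitude */
            b[i].fx += f * dx / r;                       /* force components */
            b[i].fy += f * dy / r;
            b[i].fz += f * dz / r;
        }
    }
}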

slides1-7

[Figure: Astrophysical N-body simulation by Scott Linssen (undergraduate University of North Carolina at Charlotte [UNCC] student).]

slides1-8

Parallel Computing
Using more than one computer, or a computer with more than one processor, to solve a problem.

Motives

Usually faster computation - the very simple idea that n computers operating simultaneously can achieve the result n times faster - it will not be n times faster for various reasons.

Other motives include: fault tolerance, larger amount of memory available, ...

slides1-9

Background
Parallel computers - computers with more than one processor - and their programming - parallel programming - have been around for more than 40 years.


slides1-10

Gill writes in 1958:


... There is therefore nothing new in the idea of parallel
programming, but its application to computers. The author cannot
believe that there will be any insuperable difficulty in extending it to
computers. It is not to be expected that the necessary programming
techniques will be worked out overnight. Much experimenting
remains to be done. After all, the techniques that are commonly
used in programming today were only won at the cost of
considerable toil several years ago. In fact the advent of parallel
programming may do something to revive the pioneering spirit in
programming which seems at the present to be degenerating into a
rather dull and routine occupation ...
Gill, S. (1958), Parallel Programming, The Computer Journal, vol. 1, April, pp. 2-10.

slides1-11

Notation
p = number of processors or processes
n = number of data items (used later)


slides1-12

Speedup Factor
S(p) = (Execution time using one processor, best sequential algorithm) / (Execution time using a multiprocessor with p processors) = ts / tp

where ts is the execution time on a single processor and tp is the execution time on the multiprocessor.
S(p) gives the increase in speed gained by using the multiprocessor.

Notice the use of the best sequential algorithm with the single processor system. The underlying algorithm for the parallel implementation might be (and usually is) different.

slides1-13

Speedup factor can also be cast in terms of computational steps:

S(p) = (Number of computational steps using one processor) / (Number of parallel computational steps with p processors)

Can also extend time complexity to parallel computations - see later.


slides1-14

Maximum Speedup
Maximum speedup is usually p with p processors (linear speedup).
Possible to get superlinear speedup (greater than p), but usually there is a specific reason, such as:

Extra memory in multiprocessor system

Nondeterministic algorithm


slides1-15

Maximum Speedup - Amdahl's law


[Figure: (a) with one processor, the total execution time ts consists of a serial section taking f·ts and parallelizable sections taking (1 − f)·ts; (b) with p processors, the parallelizable sections take (1 − f)·ts/p, giving a parallel execution time tp = f·ts + (1 − f)·ts/p.]


slides1-16

Speedup factor is given by:

S(p) = ts / (f·ts + (1 − f)·ts/p) = p / (1 + (p − 1)·f)

This equation is known as Amdahl's law.
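A small C sketch (the values of f and p are illustrative) that evaluates Amdahl's law; note that f = 5% gives speedups approaching 20 however many processors are used, as the next slide shows:

#include <stdio.h>

/* Amdahl's law: S(p) = p / (1 + (p - 1) * f), where f is the serial fraction. */
static double amdahl(double f, int p) {
    return p / (1.0 + (p - 1) * f);
}

int main(void) {
    double fractions[] = { 0.0, 0.05, 0.10, 0.20 };
    int procs[] = { 4, 8, 16, 256 };
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            printf("f = %4.0f%%  p = %3d  S(p) = %6.2f\n",
                   fractions[i] * 100, procs[j], amdahl(fractions[i], procs[j]));
    return 0;
}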


slides1-17

Speedup against number of processors


[Figure: speedup S(p) plotted against the number of processors p (up to 20) for serial fractions f = 0%, 5%, 10%, and 20%; the larger the serial fraction, the lower the curve.]

Even with an infinite number of processors, the maximum speedup is limited to 1/f.

Example: with only 5% of the computation being serial, the maximum speedup is 20, irrespective of the number of processors.

slides1-18

Superlinear Speedup example - Searching


[Figure (a): searching each sub-space sequentially. The search space is divided into p sub-spaces, each taking ts/p to search; the solution is found after x complete sub-space searches plus a further time Δt, i.e. at time x·ts/p + Δt, where x is indeterminate.]

slides1-19

[Figure (b): searching each sub-space in parallel. All p sub-spaces are searched simultaneously, and the solution is found after time Δt.]

slides1-20

Speed-up is then given by

S(p) = (x · ts/p + Δt) / Δt


slides1-21

Worst case for the sequential search is when the solution is found in the last sub-space search. Then the parallel version offers the greatest benefit, i.e.

S(p) = ( ((p − 1)/p) · ts + Δt ) / Δt  →  ∞  as Δt tends to zero

Least advantage for the parallel version is when the solution is found in the first sub-space search of the sequential search, i.e.

S(p) = Δt / Δt = 1

Actual speed-up depends upon which sub-space holds the solution but could be extremely large.
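A short C sketch (ts, p, and the Δt values are illustrative) of the worst-case formula above, showing S(p) growing without bound as Δt shrinks:

#include <stdio.h>

int main(void) {
    double ts = 1000.0;   /* sequential search time over all sub-spaces */
    int    p  = 8;        /* number of processors / sub-spaces          */
    double dts[] = { 10.0, 1.0, 0.1, 0.01, 0.001 };

    /* Worst case for the sequential search: solution in the last sub-space,
       so S(p) = (((p-1)/p) * ts + dt) / dt, which tends to infinity as dt -> 0. */
    for (int i = 0; i < 5; i++) {
        double dt = dts[i];
        double sp = (((double)(p - 1) / p) * ts + dt) / dt;
        printf("dt = %8.3f  S(p) = %12.1f\n", dt, sp);
    }
    return 0;
}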


slides1-22

Types of Parallel Computers


Two principal types:

Shared memory multiprocessor

Distributed memory multicomputer


slides1-23

Shared Memory Multiprocessor


slides1-24

Conventional Computer
Consists of a processor executing a program stored in a (main) memory:

[Figure: a single processor connected to main memory; instructions flow from memory to the processor, and data flows to or from the processor.]

Each main memory location is located by its address. Addresses start at 0 and extend to 2^b − 1 when there are b bits (binary digits) in the address.

slides1-25

Natural way to extend the single processor model - have multiple processors connected to multiple memory modules, such that each processor can access any memory module - the so-called shared memory configuration:

[Figure: processors connected through an interconnection network to memory modules that together form one address space.]

slides1-26

Simplistic view of a small shared memory multiprocessor

[Figure: several processors connected to a shared memory over a single bus.]

Examples:

Dual Pentiums

Quad Pentiums


slides1-27

Quad Pentium Shared Memory Multiprocessor

[Figure: four processors, each with its own L1 cache, L2 cache, and bus interface, connected by a processor/memory bus to a memory controller with shared memory, and to an I/O interface on an I/O bus.]


slides1-28

Programming Shared Memory Multiprocessors


Threads - the programmer decomposes the program into individual parallel sequences (threads), each able to access variables declared outside the threads. Example: Pthreads.

A sequential programming language with preprocessor compiler directives to declare shared variables and specify parallelism. Example: OpenMP - industry standard - needs an OpenMP compiler (see the sketch below).

A sequential programming language with added syntax to declare shared variables and specify parallelism. Example: UPC (Unified Parallel C) - needs a UPC compiler.

A parallel programming language with syntax to express parallelism, in which the compiler creates the appropriate executable code for each processor (not now common).

A sequential programming language processed by a parallelizing compiler that converts it into parallel executable code (also not now common).
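As a concrete illustration of the compiler-directive approach, a minimal OpenMP sketch (not from the slides) in which a shared array is filled by a parallel loop:

#include <stdio.h>
#include <omp.h>

#define N 1000

int main(void) {
    static double a[N];   /* shared by all threads */

    /* The loop iterations are divided among the threads of the team;
       'a' is shared, the loop index 'i' is private to each thread. */
    #pragma omp parallel for shared(a)
    for (int i = 0; i < N; i++)
        a[i] = 2.0 * i;

    printf("a[N-1] = %f, max threads = %d\n", a[N - 1], omp_get_max_threads());
    return 0;
}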

slides1-29

Message-Passing Multicomputer
Complete computers connected through an interconnection network:

[Figure: computers, each consisting of a processor and local memory, exchanging messages over an interconnection network.]

slides1-30

Interconnection Networks
With direct links between computers:

Exhaustive connections

2-dimensional and 3-dimensional meshes

Hypercube

Using switches:

Crossbar

Trees

Multistage interconnection networks


slides1-31

Two-dimensional array (mesh)


[Figure: computers/processors arranged in a two-dimensional grid, connected by links to their neighbours.]

Also three-dimensional - used in some large high performance systems.

slides1-32

Three-dimensional hypercube

[Figure: eight nodes labelled 000 to 111 arranged as a cube, with each node linked to the three nodes whose labels differ from its own in exactly one bit.]
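A small C sketch (not from the slides) of the standard hypercube labelling rule illustrated above: two nodes are directly linked exactly when their binary labels differ in one bit.

#include <stdio.h>

/* Two hypercube nodes are neighbours iff their labels differ in exactly one bit. */
static int is_neighbour(unsigned a, unsigned b) {
    unsigned diff = a ^ b;
    return diff != 0 && (diff & (diff - 1)) == 0;   /* exactly one bit set */
}

int main(void) {
    printf("%d\n", is_neighbour(0u /* 000 */, 1u /* 001 */));  /* 1: linked     */
    printf("%d\n", is_neighbour(0u /* 000 */, 3u /* 011 */));  /* 0: not linked */
    return 0;
}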


slides1-33

Four-dimensional hypercube

[Figure: sixteen nodes labelled 0000 to 1111, formed by linking the corresponding nodes of two three-dimensional hypercubes.]

Hypercubes were popular in the 1980s - not now.



slides1-34

Crossbar switch

[Figure: processors connected to memories through a grid of switches, one at each row/column crossing point.]


slides1-35

Tree

[Figure: switch elements arranged as a tree, with the root switch at the top, links between levels, and processors at the leaves.]


slides1-36

Multistage Interconnection Network


Example: Omega network, built from 2 × 2 switch elements (each set to a straight-through or crossover connection).

[Figure: eight inputs (000-111) connected to eight outputs (000-111) through stages of 2 × 2 switches.]

slides1-37

Distributed Shared Memory


Making the main memory of a group of interconnected computers look as though it is a single memory with a single address space. Then shared memory programming techniques can be used.

[Figure: computers, each with a processor and a portion of the shared memory, exchanging messages over an interconnection network.]

slides1-38

Flynn's Classifications
Flynn (1966) created a classification for computers based upon
instruction streams and data streams:

Single instruction stream-single data stream (SISD) computer


In a single processor computer, a single stream of instructions is
generated from the program. The instructions operate upon a single
stream of data items. Flynn called this single processor computer a
single instruction stream-single data stream (SISD) computer.


slides1-39

Multiple Instruction Stream-Multiple Data Stream (MIMD) Computer

General-purpose multiprocessor system - each processor has a separate program and one instruction stream is generated from each program for each processor. Each instruction operates upon different data.

Both the shared memory and the message-passing multiprocessors so far described are in the MIMD classification.


slides1-40

Single Instruction Stream-Multiple Data Stream (SIMD) Computer

A specially designed computer in which a single instruction stream is from a single program, but multiple data streams exist. The instructions from the program are broadcast to more than one processor. Each processor executes the same instruction in synchronism, but using different data.

Developed because there are a number of important applications that mostly operate upon arrays of data.


slides1-41

Multiple Program Multiple Data (MPMD) Structure

Within the MIMD classification, which we are concerned with, each processor will have its own program to execute:

[Figure: each processor executes the instructions of its own program and operates on its own data.]

slides1-42

Single Program Multiple Data (SPMD) Structure

A single source program is written and each processor will execute its personal copy of this program, although independently and not in synchronism.

The source program can be constructed so that parts of the program are executed by certain computers and not others depending upon the identity of the computer.
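A minimal sketch of the SPMD idea in C (the use of MPI calls to obtain the process identity is purely illustrative; MPI itself is introduced later in this chapter):

#include <mpi.h>
#include <stdio.h>

/* Single program, multiple data: every process runs this same source,
   but branches on its identity (rank) to do different work. */
int main(int argc, char *argv[]) {
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    if (rank == 0)
        printf("Master (rank 0 of %d): coordinating work\n", nprocs);
    else
        printf("Worker rank %d: computing its share of the data\n", rank);

    MPI_Finalize();
    return 0;
}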


slides1-43

Networked Computers as a Multicomputer Platform

A network of computers became a very attractive alternative to expensive supercomputers and parallel computer systems for high-performance computing in the early 1990s.

Several early projects. Notable:

Berkeley NOW (network of workstations) project.

NASA Beowulf project. (Will look at this one later.)

Term now used - cluster computing.



slides1-44

Key advantages:

Very high performance workstations and PCs readily available at low cost.

The latest processors can easily be incorporated into the system as they become available.

Existing software can be used or modified.


slides1-45

Message Passing Parallel Programming - Software Tools for Clusters

Parallel Virtual Machine (PVM) - developed in the late 1980s. Became very popular.

Message-Passing Interface (MPI) - standard defined in the 1990s.

Both provide a set of user-level libraries for message passing, for use with regular programming languages (C, C++, ...).
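A minimal C/MPI sketch (illustrative) of the user-level library style of message passing: process 0 sends an integer to process 1.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {                 /* process 0 sends a value to process 1 */
        int value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {          /* process 1 receives it */
        int value;
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("process 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}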


slides1-46

Beowulf Clusters*
A group of interconnected commodity computers achieving high performance with low cost.

Typically using commodity interconnects - high speed Ethernet - and the Linux OS.

* Beowulf comes from the name given by the NASA Goddard Space Flight Center cluster project.


slides1-47

Cluster Interconnects

Originally fast Ethernet on low cost clusters.

Gigabit Ethernet - easy upgrade path.

More specialized / higher performance:

Myrinet - 2.4 Gbits/sec - disadvantage: single vendor

cLan

SCI (Scalable Coherent Interface)

QNet

InfiniBand - may be important as InfiniBand interfaces may be integrated on next generation PCs


slides1-48

Dedicated cluster with a master node

[Figure: a dedicated cluster in which the compute nodes are connected through a switch; the master node, with a second Ethernet interface, provides the up link between the cluster and the external network where the user resides.]

