
CSC580

Parallel Processing
Lecture 1: Introduction to Parallel Computing

PREPARED BY: SALIZA RAMLY


Topic Description
This topic introduces students to:
◦ Motivating Parallelism
◦ Scope of Parallel Computing
◦ Parallel Computing Terminology



Motivating Parallelism


Fundamentals of Parallel Computing – Von Neumann Architecture
For over 40 years, virtually all computers have followed a common machine model known as the von Neumann computer, named after the Hungarian mathematician John von Neumann.
A von Neumann computer uses the stored-program concept: the CPU executes a stored program that specifies a sequence of read and write operations on the memory.


von Neumann Architecture
Comprised of four main components:
o Memory
o Control Unit
o Arithmetic Logic Unit
o Input/Output

Parallel computers still follow this basic design, just multiplied in units. The basic, fundamental architecture remains the same.


Basic Design
Memory is used to store both program instructions and data.
Program instructions are coded data which tell the computer to do something.
Data is simply information to be used by the program.
A central processing unit (CPU) gets instructions and/or data from memory, decodes the instructions and then sequentially performs them.
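
To illustrate the stored-program concept, here is a toy sketch (added for illustration, not part of the original slides; the opcodes are invented): program and data share one memory array, and the CPU loop fetches, decodes, and executes one instruction at a time.

#include <stdio.h>

/* A toy stored-program machine: both the "program" and its data
 * live in the same memory array, and the CPU loop fetches,
 * decodes, and executes one instruction per step. */

enum { OP_LOAD, OP_ADD, OP_PRINT, OP_HALT };

int main(void) {
    /* memory holds (opcode, operand) pairs */
    int memory[] = {
        OP_LOAD, 5,    /* acc = 5   */
        OP_ADD,  7,    /* acc += 7  */
        OP_PRINT, 0,   /* print acc */
        OP_HALT, 0
    };
    int pc = 0;   /* program counter */
    int acc = 0;  /* accumulator */

    for (;;) {
        int op  = memory[pc];       /* fetch */
        int arg = memory[pc + 1];
        pc += 2;
        switch (op) {               /* decode and execute */
        case OP_LOAD:  acc = arg;            break;
        case OP_ADD:   acc += arg;           break;
        case OP_PRINT: printf("%d\n", acc);  break;
        case OP_HALT:  return 0;
        }
    }
}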



What is Parallel Computing?
IN ORDER TO UNDERSTAND PARALLEL COMPUTING, WE MUST FIRST UNDERSTAND THE MEANING OF SERIAL COMPUTING.


Serial Computing
Traditionally, software has been written for serial computation:
o A problem is broken into a discrete series of instructions
o Instructions are executed sequentially, one after another
o Executed on a single processor
o Only one instruction may execute at any moment in time
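
To make this concrete, here is a minimal serial computation in C (an illustration added here, not from the original slides): one processor executes one instruction stream, one statement at a time.

#include <stdio.h>

#define N 8

/* Serial computation: a single instruction stream sums the array
 * one element at a time; no two operations ever overlap in time. */
int main(void) {
    int data[N] = {1, 2, 3, 4, 5, 6, 7, 8};
    int sum = 0;

    for (int i = 0; i < N; i++)
        sum += data[i];   /* executed sequentially, one per step */

    printf("sum = %d\n", sum);
    return 0;
}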


Parallel Computing
In the simplest sense, parallel computing is the simultaneous use of multiple compute resources to solve a computational problem:
o A problem is broken into discrete parts that can be solved concurrently
o Each part is further broken down to a series of instructions
o Instructions from each part execute simultaneously on different processors
o An overall control/coordination mechanism is employed
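
A minimal sketch of the same array sum decomposed into parts (not from the original slides; POSIX threads are used here as one of many possible coordination mechanisms, and the thread count is illustrative). Compile with cc -pthread.

#include <pthread.h>
#include <stdio.h>

#define N        8
#define NTHREADS 2

/* Each thread is one "part" of the problem: it sums its own
 * chunk of the array, and the main thread coordinates and
 * combines the partial results. */
int data[N] = {1, 2, 3, 4, 5, 6, 7, 8};
int partial[NTHREADS];

static void *sum_part(void *arg) {
    int id = *(int *)arg;
    int chunk = N / NTHREADS;
    int start = id * chunk;

    partial[id] = 0;
    for (int i = start; i < start + chunk; i++)
        partial[id] += data[i];
    return NULL;
}

int main(void) {
    pthread_t threads[NTHREADS];
    int ids[NTHREADS];
    int sum = 0;

    /* overall control/coordination: start the workers... */
    for (int t = 0; t < NTHREADS; t++) {
        ids[t] = t;
        pthread_create(&threads[t], NULL, sum_part, &ids[t]);
    }
    /* ...wait for them, then combine the partial results */
    for (int t = 0; t < NTHREADS; t++) {
        pthread_join(threads[t], NULL);
        sum += partial[t];
    }
    printf("sum = %d\n", sum);
    return 0;
}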


Why Parallel Computing?
THIS IS A LEGITIMATE QUESTION! PARALLEL COMPUTING IS COMPLEX IN EVERY ASPECT!


Limitations of Serial Computing
Limits to serial computing: both physical and practical reasons pose significant constraints to simply building ever faster serial computers.
o Transmission speeds: the speed of a serial computer is directly dependent upon how fast data can move through hardware. Absolute limits are the speed of light (30 cm/nanosecond) and the transmission limit of copper wire (9 cm/nanosecond). Increasing speeds necessitate increasing proximity of processing elements.
o Limits to miniaturization: processor technology is allowing an increasing number of transistors to be placed on a chip. However, even with molecular or atomic-level components, a limit will be reached on how small components can be.
o Economic limitations: it is increasingly expensive to make a single processor faster. Using a larger number of moderately fast commodity processors to achieve the same (or better) performance is less expensive.


Why Use Parallel Computing?
Main Reasons:
• Save time - wall clock time
• Solve larger / more complex problems
• Provide concurrency - do multiple things at the same time


Why Use Parallel Computing?
Other Reasons:
• Taking advantage of non-local resources - using available compute resources on a wide area network, or even the Internet, when local compute resources are scarce.
• Cost savings - using multiple "cheap" computing resources instead of paying for time on a supercomputer.
• Overcoming memory constraints - single computers have very finite memory resources. For large problems, using the memories of multiple computers may overcome this obstacle.


Parallel Computing: what for?
Parallel computing is an evolution of serial computing that attempts to emulate what
has always been the state of affairs in the natural world: many complex, interrelated
events happening at the same time, yet within a sequence.
Some examples:
o Planetary and galactic orbits
o Weather and ocean patterns
o Tectonic plate drift
o Rush hour traffic in Paris
o Automobile assembly line
o Daily operations within a business
o Building a shopping mall
o Ordering a hamburger at the drive-through.



Parallel Computing: what for?
Traditionally, parallel computing has been considered to be "the high end of
computing" and has been motivated by numerical simulations of complex systems
and "Grand Challenge Problems" such as:
o weather and climate
o chemical and nuclear reactions
o biological, human genome
o geological, seismic activity
o mechanical devices - from prosthetics to spacecraft
o electronic circuits
o manufacturing processes


Parallel Computing: what for?
Today, commercial applications are providing an equal or greater driving force in the
development of faster computers. These applications require the processing of large
amounts of data in sophisticated ways. Example applications include:
o parallel databases, data mining
o oil exploration
o web search engines, web-based business services
o computer-aided diagnosis in medicine
o management of national and multi-national corporations
o advanced graphics and virtual reality, particularly in the entertainment industry
o networked video and multi-media technologies
o collaborative work environments


Parallel Computing: what for?
ULTIMATELY, PARALLEL COMPUTING IS AN ATTEMPT TO MAXIMIZE THE INFINITE BUT SEEMINGLY SCARCE COMMODITY CALLED TIME.


Scope of Parallel Computing Applications

Parallelism finds applications in very diverse application domains, for different motivating reasons. These range from improved application performance to cost considerations.

Who is Using Parallel Computing?
o Applications in Engineering and Design
o Scientific Applications
o Commercial Applications
o Applications in Computer Systems


(1) Applications in Engineering and Design
o Design of airfoils (optimizing lift, drag, stability), internal combustion engines
(optimizing charge distribution, burn), high-speed circuits (layouts for delays
and capacitive and inductive effects), and structures (optimizing structural
integrity, design parameters, cost, etc.).
o Design and simulation of micro- and nano-scale systems.
o Process optimization, operations research.



(2) Scientific Applications
o Functional and structural characterization of genes and proteins.
o Advances in computational physics and chemistry have explored new
materials, understanding of chemical pathways, and more efficient processes.
o Applications in astrophysics have explored the evolution of galaxies,
thermonuclear processes, and the analysis of extremely large datasets from
telescopes.
o Weather modeling, mineral prospecting, flood prediction, etc., are other
important applications.
o Bioinformatics and astrophysics also present some of the most challenging
problems with respect to analyzing extremely large datasets.



(3) Commercial Applications
o Some of the largest parallel computers power Wall Street!
o Data mining and analysis for optimizing business and marketing decisions.
o Large scale servers (mail and web servers) are often implemented using
parallel platforms.
o Applications such as information retrieval and search are typically powered by
large clusters.



(4) Applications in Computer Systems
o Network intrusion detection, cryptography, multiparty computations are some
of the core users of parallel computing techniques.
o Embedded systems increasingly rely on distributed control algorithms.
o A modern automobile consists of tens of processors communicating to
perform complex tasks for optimizing handling and performance.
o Conventional structured peer-to-peer networks impose overlay networks and
utilize algorithms directly from parallel computing.



Flynn's Classical Taxonomy

There are different ways to classify parallel computers.
o One of the more widely used classifications, in use since 1966, is called Flynn's Taxonomy.
o Flynn's taxonomy distinguishes multi-processor computer architectures according to how they can be classified along the two independent dimensions of Instruction Stream and Data Stream. Each of these dimensions can have only one of two possible states: Single or Multiple.
o Together, these two dimensions define the 4 possible classifications according to Flynn: SISD, SIMD, MISD, and MIMD, described below.


a) Single Instruction, Single Data (SISD)
o A serial (non-parallel) computer
o Single Instruction: Only one instruction stream is being acted on by the CPU during any one clock cycle
o Single Data: Only one data stream is being used as input during any one clock cycle
o Deterministic execution
o This is the oldest type of computer
o Examples: older generation mainframes, minicomputers, workstations and single processor/core PCs
b) Single Instruction, Multiple Data (SIMD)
o A type of parallel computer
o Single Instruction: All processing units execute the same instruction at any given clock cycle
o Multiple Data: Each processing unit can operate on a different data element
o Best suited for specialized problems characterized by a high degree of regularity, such as graphics/image processing
o Synchronous (lockstep) and deterministic execution
o Two varieties: Processor Arrays and Vector Pipelines
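
A small C sketch of the SIMD idea (added for illustration, not from the original slides), using x86 SSE intrinsics: a single instruction, _mm_add_ps, performs four float additions in lockstep. Assumes an x86 processor with SSE support.

#include <stdio.h>
#include <xmmintrin.h>   /* x86 SSE intrinsics */

/* SIMD: one instruction operates on multiple data elements. */
int main(void) {
    float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float b[4] = {10.0f, 20.0f, 30.0f, 40.0f};
    float c[4];

    __m128 va = _mm_loadu_ps(a);      /* load 4 floats at once */
    __m128 vb = _mm_loadu_ps(b);
    __m128 vc = _mm_add_ps(va, vb);   /* ONE instruction, 4 adds */
    _mm_storeu_ps(c, vc);

    for (int i = 0; i < 4; i++)
        printf("%.1f ", c[i]);        /* 11.0 22.0 33.0 44.0 */
    printf("\n");
    return 0;
}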
c) Multiple Instructions, Single Data (MISD)
o A type of parallel computer
o Multiple Instructions: Each processing unit operates on the data independently via separate instruction streams
o Single Data: A single data stream is fed into multiple processing units
o Few (if any) actual examples of this class of parallel computer have ever existed
o Some conceivable uses might be:
  o multiple frequency filters operating on a single signal stream
  o multiple cryptography algorithms attempting to crack a single coded message


d) Multiple Instructions, Multiple Data (MIMD)
o A type of parallel computer
o Multiple Instruction: Every processor may be executing a different instruction stream
o Multiple Data: Every processor may be working with a different data stream
o Execution can be synchronous or asynchronous, deterministic or non-deterministic
o Currently, the most common type of parallel computer - most modern supercomputers fall into this category
o Examples: most current supercomputers, networked parallel computer clusters and "grids", multi-processor SMP computers, multi-core PCs
o Note: many MIMD architectures also include SIMD execution sub-components
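
A hedged C sketch of the MIMD idea (added for illustration, not from the original slides), using POSIX threads: two threads run different instruction streams on different data, possibly at the same moment in time. Compile with cc -pthread.

#include <pthread.h>
#include <stdio.h>

/* MIMD: each thread runs a DIFFERENT instruction stream on
 * DIFFERENT data. */

static void *sum_stream(void *arg) {        /* instruction stream 1 */
    int *v = arg, s = 0;
    for (int i = 0; i < 4; i++) s += v[i];
    printf("sum = %d\n", s);
    return NULL;
}

static void *max_stream(void *arg) {        /* instruction stream 2 */
    int *v = arg, m = v[0];
    for (int i = 1; i < 4; i++) if (v[i] > m) m = v[i];
    printf("max = %d\n", m);
    return NULL;
}

int main(void) {
    int a[4] = {3, 1, 4, 1};   /* data stream 1 */
    int b[4] = {5, 9, 2, 6};   /* data stream 2 */
    pthread_t t1, t2;

    pthread_create(&t1, NULL, sum_stream, a);
    pthread_create(&t2, NULL, max_stream, b);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}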


Parallel Computing Terminology
Some General Parallel Terminology
Like everything else, parallel computing has its own "jargon". Some of the more commonly used terms associated with parallel computing are listed below. Most of these will be discussed in more detail later.


Task
• A logically discrete section of computational work. A task is typically a program or program-like set of instructions that is executed by a processor.

Parallel Task
• A task that can be executed by multiple processors safely (yields correct results).

Serial Execution
• Execution of a program sequentially, one statement at a time. In the simplest sense, this is what happens on a one-processor machine. However, virtually all parallel tasks will have sections of a parallel program that must be executed serially.


Parallel Execution
• Execution of a program by more than one task, with each task being able to execute the same or different statement at the same moment in time.

Shared Memory
• From a strictly hardware point of view, describes a computer architecture where all processors have direct (usually bus-based) access to common physical memory. In a programming sense, it describes a model where parallel tasks all have the same "picture" of memory and can directly address and access the same logical memory locations regardless of where the physical memory actually exists.

Distributed Memory
• In hardware, refers to network-based memory access for physical memory that is not common. As a programming model, tasks can only logically "see" local machine memory and must use communications to access memory on other machines where other tasks are executing.
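
A minimal MPI sketch of the distributed-memory model (added for illustration, not from the original slides): each task sees only its own memory, so task 1 can obtain task 0's value only through communication. Assumes an MPI installation (compile with mpicc, run with mpirun -np 2).

#include <mpi.h>
#include <stdio.h>

/* Distributed memory: each task has its own private 'value';
 * sharing it requires an explicit message. */
int main(int argc, char **argv) {
    int rank, value;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;                       /* lives only in task 0's memory */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("task 1 received %d from task 0\n", value);
    }

    MPI_Finalize();
    return 0;
}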


Communications
• Parallel tasks typically need to exchange data. There are several ways this can be accomplished, such as through a shared memory bus or over a network; however, the actual event of data exchange is commonly referred to as communications regardless of the method employed.

Synchronization
• The coordination of parallel tasks in real time, very often associated with communications. Often implemented by establishing a synchronization point within an application where a task may not proceed further until another task(s) reaches the same or logically equivalent point.
• Synchronization usually involves waiting by at least one task, and can therefore cause a parallel application's wall clock execution time to increase.
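
A small POSIX-threads sketch of a synchronization point (added for illustration, not from the original slides), using a barrier: no thread proceeds past pthread_barrier_wait until all threads have reached it, so at least one thread waits. Assumes POSIX barriers are available (Linux); compile with cc -pthread.

#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4

/* Synchronization point: no thread passes the barrier until all
 * NTHREADS threads have reached it. */
pthread_barrier_t barrier;

static void *worker(void *arg) {
    int id = *(int *)arg;

    printf("thread %d: working\n", id);
    pthread_barrier_wait(&barrier);        /* everyone waits here */
    printf("thread %d: past the barrier\n", id);
    return NULL;
}

int main(void) {
    pthread_t t[NTHREADS];
    int ids[NTHREADS];

    pthread_barrier_init(&barrier, NULL, NTHREADS);
    for (int i = 0; i < NTHREADS; i++) {
        ids[i] = i;
        pthread_create(&t[i], NULL, worker, &ids[i]);
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    pthread_barrier_destroy(&barrier);
    return 0;
}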


Granularity
• In parallel computing, granularity is a qualitative measure of the ratio of computation to communication.
• Coarse: relatively large amounts of computational work are done between communication events
• Fine: relatively small amounts of computational work are done between communication events

Observed Speedup
• Observed speedup of a code which has been parallelized, defined as:

  speedup = wall-clock time of serial execution / wall-clock time of parallel execution

• One of the simplest and most widely used indicators for a parallel program's performance.
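
A quick worked example (added here, not from the original slides): if the serial version of a program runs in 100 seconds of wall-clock time and the parallelized version runs in 20 seconds on 8 processors, the observed speedup is 100 / 20 = 5, short of the ideal speedup of 8 because of parallel overhead.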


Parallel Overhead
• The amount of time required to coordinate parallel tasks, as opposed to doing useful work. Parallel overhead can include factors such as:
  • Task start-up time
  • Synchronizations
  • Data communications
  • Software overhead imposed by parallel compilers, libraries, tools, operating system, etc.
  • Task termination time

Massively Parallel
• Refers to the hardware that comprises a given parallel system - having many processors. The meaning of "many" keeps increasing, but currently BG/L pushes this number to 6 digits.


Scalability
• Refers to a parallel system's (hardware and/or software) ability to demonstrate a proportionate increase in parallel speedup with the addition of more processors. Factors that contribute to scalability include:
  • Hardware - particularly memory-CPU bandwidths and network communications
  • Application algorithm
  • Parallel overhead related
  • Characteristics of your specific application and coding


NEXT! LECTURE 2: PARALLEL PLATFORMS (PART 1)
