
Professional Elective-IV

(R18A0530) Parallel and Distributed Computing


UNIT-I
Introduction: Scope, issues, applications and challenges of Parallel and Distributed Computing
Parallel Programming Platforms: Implicit Parallelism: Trends in Microprocessor Architectures, Dichotomy
of Parallel Computing Platforms, Physical Organization, co-processing.
Parallel Computing:
Parallel computing refers to the process in which several processors execute an application or computation simultaneously. Generally, it is a computing architecture in which a large problem is broken into independent, smaller, usually similar parts that can be processed in one go. This is done by multiple CPUs communicating via shared memory, and the results are combined upon completion. It helps in performing large computations by dividing the problem among more than one processor.
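As an illustration (added to these notes, not part of the original source), the following Python sketch divides a large summation into independent chunks, processes them in parallel with a pool of worker processes, and combines the partial results; the chunk size and worker count are arbitrary choices.

from multiprocessing import Pool

def partial_sum(chunk):
    # Each worker independently sums its own slice of the data.
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    # Break the large problem into smaller, independent, similar parts.
    chunks = [data[i:i + 250_000] for i in range(0, len(data), 250_000)]
    with Pool(processes=4) as pool:
        partials = pool.map(partial_sum, chunks)  # parts processed in parallel
    print(sum(partials))                          # results combined upon completion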
Distributed Computing
In distributed computing we have multiple autonomous computers which appear to the user as a single system. In distributed systems there is no shared memory, and the computers communicate with each other through message passing. In distributed computing a single task is divided among different computers.
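The contrast can be sketched in code. The example below is a minimal, hypothetical illustration in which two local Python processes stand in for autonomous computers and exchange only messages through queues, with no shared memory; a real distributed system would communicate over a network (sockets, RPC, or MPI) instead.

from multiprocessing import Process, Queue

def worker(inbox, outbox):
    task = inbox.get()        # receive a message describing the task
    outbox.put(sum(task))     # send the result back as another message

if __name__ == "__main__":
    inbox, outbox = Queue(), Queue()
    p = Process(target=worker, args=(inbox, outbox))
    p.start()
    inbox.put([1, 2, 3, 4])   # no shared memory: only messages are exchanged
    print(outbox.get())       # prints 10
    p.join()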
Difference between Parallel Computing and Distributed Computing:

S.No.  Parallel Computing                                 Distributed Computing

1      Many operations are performed simultaneously.      System components are located at
                                                          different locations.

2      A single computer is required.                     Multiple computers are used.

3      Multiple processors perform multiple operations.   Multiple computers perform multiple
                                                          operations.

4      It may have shared or distributed memory.          It has only distributed memory.

5      Processors communicate with each other             Computers communicate with each other
       through a bus.                                     through message passing.

6      Improves system performance.                       Improves system scalability, fault
                                                          tolerance, and resource-sharing
                                                          capabilities.

Types of parallel computing

Across open-source and proprietary parallel computing platforms, there are generally three types of parallelism available, which are discussed below:

1. Bit-level parallelism: the form of parallel computing in which the work done per instruction depends on the processor word size. When performing a task on large-sized data, a wider word reduces the number of instructions the processor must execute; otherwise the operation has to be split into a series of instructions. For example, suppose an 8-bit processor must perform an operation on 16-bit numbers. It must first operate on the 8 lower-order bits and then on the 8 higher-order bits, so two instructions are needed. A 16-bit processor can perform the same operation with a single instruction.

2. Instruction-level parallelism: the processor determines, within a single CPU clock cycle, how many instructions can be executed at the same time. In the hardware approach the processor discovers this parallelism dynamically at run time, while the software approach relies on static parallelism, in which the compiler decides which instructions to execute simultaneously.

3. Task Parallelism: task parallelism is the form of parallelism in which a task is decomposed into subtasks. Each subtask is then allocated for execution, and the subtasks are executed concurrently by different processors, as sketched below.
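The sketch below (an added illustration with hypothetical function names, not part of the original notes) shows task parallelism in Python: two different subtasks of the same job are submitted to a process pool and executed concurrently.

from concurrent.futures import ProcessPoolExecutor

def compute_statistics(data):
    # Subtask 1: summary statistics of the data set.
    return {"min": min(data), "max": max(data), "mean": sum(data) / len(data)}

def compute_histogram(data, buckets=4):
    # Subtask 2: a simple fixed-width histogram of the same data set.
    lo = min(data)
    width = (max(data) - lo) / buckets or 1
    hist = [0] * buckets
    for x in data:
        hist[min(int((x - lo) / width), buckets - 1)] += 1
    return hist

if __name__ == "__main__":
    data = list(range(100))
    with ProcessPoolExecutor(max_workers=2) as ex:
        stats = ex.submit(compute_statistics, data)  # subtask 1
        hist = ex.submit(compute_histogram, data)    # subtask 2
        print(stats.result(), hist.result())         # both ran concurrently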

Applications of Parallel Computing

There are various applications of Parallel Computing, which are as follows:

o One of the primary applications of parallel computing is Databases and Data mining.
o The real-time simulation of systems is another use of parallel computing.
o Technologies such as networked video and multimedia.
o Science and Engineering.
o Collaborative work environments.
o The concept of parallel computing is used by augmented reality, advanced graphics, and virtual reality.

Advantages of Parallel computing

Parallel computing advantages are discussed below:

o In parallel computing, more resources are used to complete the task, which decreases the time taken and cuts possible costs. Also, cheap components can be used to construct parallel clusters.
o Compared with serial computing, parallel computing can solve larger problems in a shorter time.
o For simulating, modeling, and understanding complex, real-world phenomena, parallel computing is much more appropriate than serial computing.
o When local resources are finite, it can offer the benefit of drawing on non-local resources.
o Many problems are so large that it is impractical or impossible to solve them on a single computer; parallel computing removes this limitation.
o One of the best advantages of parallel computing is that it allows you to do several things at a time by using multiple computing resources.
o Furthermore, parallel computing makes better use of the hardware, whereas serial computing wastes potential computing power.

Disadvantages of Parallel Computing

There are many limitations of parallel computing, which are as follows:

o Parallel architecture can be difficult to achieve.
o In the case of clusters, better cooling technologies are needed.
o It requires algorithms that can be managed by the parallel mechanism.
o Multi-core architectures have high power consumption.
o A parallel computing system needs low coupling and high cohesion, which is difficult to create.
o Code for a parallelism-based program can be written only by the most technically skilled and expert programmers.
o Although parallel computing helps resolve computationally and data-intensive problems by using multiple processors, it sometimes affects the overall behaviour of the system, and some control algorithms do not produce good outcomes when run in parallel.
o Due to synchronization, thread creation, data transfers, and more, the extra overhead can be quite large and may even exceed the gains from parallelization.
o Moreover, to improve performance, a parallel computing system needs different code tweaking for different target architectures.

Future of Parallel Computing

From serial computing to parallel computing, the computational landscape has completely changed. Tech giants like Intel have already started to include multicore processors in their systems, which is a great step toward parallel computing. For a better future, parallel computing will revolutionize the way computers work, and it plays an important role in connecting the world more closely than before. Moreover, the parallel computing approach becomes ever more necessary with multi-processor computers, faster networks, and distributed systems.

Parallel and distributed computing occurs across many different topic areas in computer science,
including algorithms, computer architecture, networks, operating systems, and software engineering.

Platform-based development

Platform-based development is concerned with the design and development of applications for specific types
of computers and operating systems (“platforms”). Platform-based development takes into account system-
specific characteristics, such as those found in Web programming, multimedia development, mobile application
development, and robotics.

Security and information assurance

Security and information assurance refers to policy and technical elements that protect information systems by
ensuring their availability, integrity, authentication, and appropriate levels of confidentiality. Information
security concepts occur in many areas of computer science, including operating systems, computer
networks, databases, and software.

Software engineering

Software engineering is the discipline concerned with the application of theory, knowledge, and practice to
building reliable software systems that satisfy the computing requirements of customers and users. It is
applicable to small-, medium-, and large-scale computing systems and organizations. Software engineering
uses engineering methods, processes, techniques, and measurements. Software development, whether done by
an individual or a team, requires choosing the most appropriate tools, methods, and approaches for a
given environment.

Social and professional issues

Computer scientists must understand the relevant social, ethical, and professional issues that surround their
activities. The ACM Code of Ethics and Professional Conduct provides a basis for personal responsibility and
professional conduct for computer scientists who are engaged in system development that directly affects the
general public.

Issues present in parallel and distributed paradigms:

Traditionally, distributed computing focused on resource availability, result correctness, code portability, and transparency of access to resources, more than on issues of efficiency and speed which, in addition to scalability, are central to parallel computing.
What are the issues in distributed computing?

Issues in Distributed Systems
 The lack of global knowledge.
 Naming.
 Scalability.
 Compatibility.
 Process synchronization (requires global knowledge)
 Resource management (requires global knowledge)
 Security.
 Fault tolerance, error recovery.

Parallel computers can be classified according to the level at which the architecture supports parallelism, multi-core and multi-processor computers being common examples. Key operating-system design issues in this setting include process synchronization, memory management, communication, and concurrency control.

Applications of parallel and distributed computing:

Applications of parallel computing:

Notable applications for parallel processing (also known as parallel computing) include:

(1) computational astrophysics
(2) geoprocessing (or seismic surveying)
(3) climate modeling
(4) agriculture estimates
(5) financial risk management
(6) video color correction
(7) computational fluid dynamics
(8) medical imaging and drug discovery

Applications of Distributed computing:

What are the applications of distributed computing?

Social networks, mobile systems, online banking, and online gaming (e.g., multiplayer systems) all rely on efficient distributed systems. Additional areas of application for distributed computing include e-learning platforms, artificial intelligence, and e-commerce.

Challenges of Parallel and distributed Systems:

What are the challenges of parallel and distributed computing?


Important concerns are workload sharing, which attempts to take advantage of access to multiple computers to
complete jobs faster; task migration, which supports workload sharing by efficiently distributing jobs among
machines; and automatic task replication, which occurs at different sites for greater reliability.
Challenges of distributed Systems
(1) Heterogeneity: the Internet enables users to access services and run applications over a heterogeneous collection of computers and networks.
(2) Transparency
(3) Openness
(4) Concurrency
(5) Security
(6) Scalability
(7) Failure handling

Parallel Programming Platforms: Implicit Parallelism: Trends in Microprocessor Architectures, Dichotomy of


Parallel Computing Platforms, Physical Organization, co-processing.

Scope of Parallelism
• Conventional architectures coarsely comprise a processor, a memory system, and the datapath.
• Each of these components presents significant performance bottlenecks.
• Parallelism addresses each of these components in significant ways.
• Different applications utilize different aspects of parallelism – e.g., data-intensive applications utilize high aggregate throughput, server applications utilize high aggregate network bandwidth, and scientific applications typically utilize high processing and memory system performance.
• It is important to understand each of these performance bottlenecks.

Implicit Parallelism: Trends in Microprocessor Architectures

 Microprocessor clock speeds have posted impressive gains over the past two decades (two to three orders of magnitude).
 Higher levels of device integration have made available a large number of transistors.
 The question of how best to utilize these resources is an important one.
 Current processors use these resources in multiple functional units and execute multiple instructions in the same cycle.
 The precise manner in which these instructions are selected and executed provides impressive diversity in architectures.

Pipelining and Superscalar Execution

• Pipelining overlaps various stages of instruction execution to achieve performance.

• At a high level of abstraction, an instruction can be executed while the next one is being decoded and the
next one is being fetched.

• This is akin to an assembly line for the manufacture of cars.

• Pipelining, however, has several limitations.

• The speed of a pipeline is eventually limited by the slowest stage.

• For this reason, conventional processors rely on very deep pipelines (20 stage pipelines in state-of-the-art
Pentium processors).

• However, in typical program traces, every 5-6th instruction is a conditional jump! This requires very accurate
branch prediction.

 The penalty of a misprediction grows with the depth of the pipeline, since a larger number of instructions will have to be flushed.
 One simple way of alleviating these bottlenecks is to use multiple pipelines.
 The question then becomes one of selecting these instructions.

Superscalar Execution

Scheduling of instructions is determined by a number of factors:

• True Data Dependency: The result of one operation is an input to the next.

• Resource Dependency: Two operations require the same resource.

• Branch Dependency: Scheduling instructions across conditional branch statements cannot be done
deterministically a-priori.

• The scheduler, a piece of hardware, looks at a large number of instructions in an instruction queue and selects an appropriate number of instructions to execute concurrently based on these factors.

• The complexity of this hardware is an important constraint on superscalar processors.
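As a small added illustration (Python statements standing in for machine instructions), the snippet below annotates the three kinds of dependencies that constrain how a superscalar scheduler may issue operations concurrently.

a, b, z = 2, 3, 10
c = a * b      # (1)
d = c + z      # (2) true data dependency: needs the result of (1)
e = a * 7      # (3) independent of (2), but if the processor has only one
               #     multiplier, (1) and (3) share a resource dependency
if d > 5:      # (4) branch dependency: instructions after the branch cannot be
    f = e - d  #     scheduled deterministically before its outcome is known
print(c, d, e)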

Very Long Instruction Word (VLIW) Processors

• The hardware cost and complexity of the superscalar scheduler is a major consideration in processor design.

• To address these issues, VLIW processors rely on compile-time analysis to identify and bundle together instructions that can be executed concurrently.

• These instructions are packed and dispatched together, and thus the name very long instruction word.

• This concept was used with some commercial success in the Multiflow Trace machine (circa 1984).

• Variants of this concept are employed in the Intel IA64 processors.

Very Long Instruction Word (VLIW) Processors: Considerations

• Issue hardware is simpler.

• Compiler has a bigger context from which to select coscheduled instructions.

• Compilers, however, do not have runtime information such as cache misses. Scheduling is, therefore,
inherently conservative.

• Branch and memory prediction is more difficult.

• VLIW performance is highly dependent on the compiler. A number of techniques such as loop unrolling, speculative execution, and branch prediction are critical.

• Typical VLIW processors are limited to 4-way to 8-way parallelism.

Dichotomy of Parallel Computing Platforms:

 An explicitly parallel program must specify concurrency and interaction between concurrent subtasks.
 The former is sometimes also referred to as the control structure and the latter as the communication
model.

Control Structure of Parallel Programs

• Parallelism can be expressed at various levels of granularity – from instruction level to processes.

• Between these extremes exist a range of models, along with corresponding architectural support.

• Processing units in parallel computers either operate under the centralized control of a single control unit or
work independently.

• If there is a single control unit that dispatches the same instruction to various processors (that work on
different data), the model is referred to as single instruction stream, multiple data stream (SIMD).

• If each processor has its own control unit, each processor can execute different instructions on different data items. This model is called multiple instruction stream, multiple data stream (MIMD).

SIMD and MIMD Processors

Figure: a typical SIMD architecture (a) and a typical MIMD architecture (b).

SIMD Processors

• Some of the earliest parallel computers such as the Illiac IV, MPP, DAP, CM-2, and MasPar MP-1 belonged
to this class of machines.

• Variants of this concept have found use in co-processing units such as the MMX units in Intel processors
and DSP chips such as the Sharc.

• SIMD relies on the regular structure of computations (such as those in image processing).

• It is often necessary to selectively turn off operations on certain data items. For this reason, most SIMD
programming paradigms allow for an “activity mask”, which determines if a processor should participate in
a computation or not.
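A rough software analogy (an added sketch assuming the NumPy library is available) is shown below: the same operation is applied across all data items, but an activity mask determines which elements actually take the new value.

import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8])
mask = data % 2 == 0                        # "processors" allowed to participate
result = np.where(mask, data * 10, data)    # masked-off elements keep old values
print(result)                               # [ 1 20  3 40  5 60  7 80]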

MIMD Processors

• In contrast to SIMD processors, MIMD processors can execute different programs on different processors.

• A variant of this, called single program multiple data streams (SPMD) executes the same program on
different processors.

• It is easy to see that SPMD and MIMD are closely related in terms of programming flexibility and
underlying architectural support.

• Examples of such platforms include current generation Sun Ultra Servers, SGI Origin Servers,
multiprocessor PCs, workstation clusters, and the IBM SP.
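A minimal SPMD sketch follows; it assumes the optional mpi4py package and an MPI runtime are installed, and the file name in the launch command is hypothetical. Every process runs the same program and branches on its rank.

# Launch with, e.g.:  mpiexec -n 4 python spmd_demo.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()   # this process's identity
size = comm.Get_size()   # total number of processes

# Same program, different data: each rank works on its own slice.
local = sum(range(rank * 1000, (rank + 1) * 1000))
total = comm.reduce(local, op=MPI.SUM, root=0)   # combine via message passing
if rank == 0:
    print("sum over", size, "ranks:", total)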

SIMD-MIMD Comparison

• SIMD computers require less hardware than MIMD computers (single control unit).

• However, since SIMD processors are specially designed, they tend to be expensive and have long design cycles.

• Not all applications are naturally suited to SIMD processors.

• In contrast, platforms supporting the SPMD paradigm can be built from inexpensive off-the-shelf
components with relatively little effort in a short amount of time.

Communication Model of Parallel Platforms

• There are two primary forms of data exchange between parallel tasks – accessing a shared data space and
exchanging messages.

• Platforms that provide a shared data space are called shared address-space machines or multiprocessors.

• Platforms that support messaging are also called message-passing platforms or multicomputers.

 Shared-Address-Space Platforms

• Part (or all) of the memory is accessible to all processors.

• Processors interact by modifying data objects stored in this shared-address-space.

• If the time taken by a processor to access any memory word in the system (global or local) is identical, the platform is classified as a uniform memory access (UMA) machine; otherwise, it is a non-uniform memory access (NUMA) machine.
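As an added sketch (using Python's multiprocessing shared-memory objects to stand in for a real multiprocessor's shared memory), the example below shows processes interacting by modifying a shared data object, with a lock providing the coordination of read-write data discussed below.

from multiprocessing import Process, Value, Lock

def add_many(counter, lock, n):
    for _ in range(n):
        with lock:               # coordinate concurrent read-modify-write
            counter.value += 1

if __name__ == "__main__":
    counter = Value("i", 0)      # one word in the shared address space
    lock = Lock()
    workers = [Process(target=add_many, args=(counter, lock, 10_000)) for _ in range(4)]
    for w in workers: w.start()
    for w in workers: w.join()
    print(counter.value)         # 40000, because updates were coordinated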

 NUMA and UMA Shared-Address-Space Platforms:

• The distinction between NUMA and UMA platforms is important from the point of view of algorithm design. NUMA machines require locality from underlying algorithms for performance.

• Programming these platforms is easier since reads and writes are implicitly visible to other processors.

• However, read-write access to shared data must be coordinated (this will be discussed in greater detail when we talk about threads programming).

• Caches in such machines require coordinated access to multiple copies. This leads to the cache coherence
problem.

• A weaker model of these machines provides an address map, but not coordinated access. These models are
called non cache coherent shared address space machines.

 Shared-Address-Space vs. Shared Memory Machines

• It is important to note the difference between the terms shared address space and shared memory.

• We refer to the former as a programming abstraction and to the latter as a physical machine attribute.

• It is possible to provide a shared address space using a physically distributed memory.

Physical Organization of Parallel Platforms:

We begin this discussion with an ideal parallel machine called Parallel Random Access Machine, or PRAM.

Architecture of an Ideal Parallel Computer

• A natural extension of the Random Access Machine (RAM) serial architecture is the Parallel Random
Access Machine, or PRAM.

• PRAMs consist of p processors and a global memory of unbounded size that is uniformly accessible to all
processors.

• Processors share a common clock but may execute different instructions in each cycle.

Depending on how simultaneous memory accesses are handled, PRAMs can be divided into four subclasses.

• Exclusive-read, exclusive-write (EREW) PRAM.

• Concurrent-read, exclusive-write (CREW) PRAM

• Exclusive-read, concurrent-write (ERCW) PRAM.

• Concurrent-read, concurrent-write (CRCW) PRAM.

When concurrent writes to the same memory word are allowed, the conflict must be resolved by one of the following protocols:
• Common: write only if all values are identical.

• Arbitrary: write the data from a randomly selected processor.

• Priority: follow a predetermined priority order.

• Sum: Write the sum of all data items.
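The short simulation below is an added illustration (not from the original notes) of how the same set of concurrent writes to one memory cell would be resolved under each protocol.

import random

def resolve_concurrent_write(values, protocol):
    # 'values' holds what processors 0, 1, 2, ... tried to write in one step.
    if protocol == "common":
        return values[0] if len(set(values)) == 1 else None  # only if identical
    if protocol == "arbitrary":
        return random.choice(values)   # a randomly selected processor wins
    if protocol == "priority":
        return values[0]               # highest-priority (lowest-numbered) wins
    if protocol == "sum":
        return sum(values)             # the cell receives the sum of all writes
    raise ValueError(protocol)

writes = [5, 7, 5]
for proto in ("common", "arbitrary", "priority", "sum"):
    print(proto, resolve_concurrent_write(writes, proto))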

Physical Complexity of an Ideal Parallel Computer

• Processors and memories are connected via switches.

• Since these switches must operate in O(1) time at the level of words, for a system of p processors and m words, the switch complexity is O(mp).

• Clearly, for meaningful values of p and m, a true PRAM is not realizable.
