COA - Module-5
4. Parallelism
4.1 Introduction
Parallel Processing
Parallel processing can be described as a class of techniques that enables a
system to carry out simultaneous data-processing tasks to increase the
computational speed of a computer system.
A parallel processing system can carry out simultaneous data-processing to
achieve a faster execution time.
For instance, while an instruction is being processed in the ALU component of the
CPU, the next instruction can be read from memory.
The primary purpose of parallel processing is to enhance the computer's processing
capability and increase its throughput.
A parallel processing system can be achieved by having a multiplicity of
functional units that perform identical or different operations simultaneously.
The data can be distributed among the various functional units.
The following diagram shows one possible way of separating the execution unit
into eight functional units operating in parallel.
The operation performed in each functional unit is indicated in each block of the
diagram:
The adder and integer multiplier perform arithmetic operations on integer
numbers.
The floating-point operations are separated into three circuits operating in parallel.
The logic, shift, and increment operations can be performed concurrently on
different data.
All units are independent of each other, so one number can be shifted while
another number is being incremented.
Parallel computers can be roughly classified according to the level at which the
hardware supports parallelism, with multi-core and multi-processor computers
having multiple processing elements within a single machine.
In some cases parallelism is transparent to the programmer, such as in bit-level or
instruction-level parallelism.
But explicitly parallel algorithms, particularly those that use concurrency, are more
difficult to write than sequential ones, because concurrency introduces several new
classes of potential software bugs, of which race conditions are the most common.
Communication and synchronization between the different subtasks are typically
some of the greatest obstacles to getting optimal parallel program performance.
Types of Parallelism:
1. Bit-level parallelism: This form of parallel computing is based on increasing the
processor word size. It reduces the number of instructions that the system
must execute in order to perform a task on large-sized data.
Example: Consider a scenario where an 8-bit processor must compute the sum of
two 16-bit integers. It must first sum the 8 lower-order bits and then the 8
higher-order bits, thus requiring two instructions to perform the operation. A 16-
bit processor can perform the operation with just one instruction.
2. Instruction-level parallelism: A processor can ordinarily issue less than one
instruction per clock cycle phase. Instructions can be re-ordered and
grouped so that they are later executed concurrently without affecting the result of the
program. This is called instruction-level parallelism.
3. Task parallelism: Task parallelism employs the decomposition of a task into
subtasks and then allocates each of the subtasks for execution. The processors
execute the subtasks concurrently.
4. Data-level parallelism (DLP): Instructions from a single stream operate
concurrently on several data items. It is limited by non-regular data manipulation
patterns and by memory bandwidth.
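The bit-level parallelism example above can be sketched in code. The function names are illustrative; the point is that an 8-bit datapath needs two add steps (low bytes first, then high bytes plus the carry) where a 16-bit datapath needs one:

```python
# Sketch: summing two 16-bit integers on an 8-bit datapath takes two
# 8-bit additions, while a 16-bit datapath does it in one operation.
# Function names are illustrative, not from any real ISA.

def add16_on_8bit(a, b):
    """Add two 16-bit values using only 8-bit operations."""
    lo = (a & 0xFF) + (b & 0xFF)                          # step 1: low-order bytes
    carry = lo >> 8
    hi = ((a >> 8) & 0xFF) + ((b >> 8) & 0xFF) + carry    # step 2: high bytes + carry
    return ((hi & 0xFF) << 8) | (lo & 0xFF), 2            # (result, add steps used)

def add16_on_16bit(a, b):
    """Add two 16-bit values in a single 16-bit operation."""
    return (a + b) & 0xFFFF, 1

result8, steps8 = add16_on_8bit(0x1234, 0x0FCD)
result16, steps16 = add16_on_16bit(0x1234, 0x0FCD)
assert result8 == result16 == 0x2201
assert (steps8, steps16) == (2, 1)
```

The same idea motivated the historical move from 4-bit to 8-, 16-, 32-, and 64-bit datapaths described in the Architectural Trends section below.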
Architectural Trends
When multiple operations are executed in parallel, the number of cycles needed to
execute the program is reduced.
However, resources are needed to support each of the concurrent activities.
Resources are also needed to allocate local storage.
The best performance is achieved by an intermediate action plan that uses
resources to utilize a degree of parallelism and a degree of locality.
Generally, the history of computer architecture has been divided into four
generations based on the following basic technologies:
Vacuum tubes
Transistors
Integrated circuits
VLSI
Until 1985, the period was dominated by growth in bit-level parallelism:
4-bit microprocessors were followed by 8-bit, 16-bit, and so on.
To reduce the number of cycles needed to perform a full 32-bit operation, the
width of the data path was doubled. Later on, 64-bit operations were introduced.
Growth in instruction-level parallelism dominated the mid-80s to mid-90s.
The RISC approach showed that it was simple to pipeline the steps of instruction
processing so that, on average, an instruction is executed in almost every cycle.
Growth in compiler technology has made instruction pipelines more productive.
In the mid-80s, microprocessor-based computers consisted of:
An integer processing unit
A floating-point unit
A cache controller
SRAMs for the cache data
Tag storage
As chip capacity increased, all these components were merged into a single chip.
Thus, a single chip consisted of separate hardware for integer arithmetic, floating-
point operations, memory operations, and branch operations.
Besides pipelining individual instructions, such a processor fetches multiple instructions at a
time and sends them in parallel to different functional units whenever possible.
This type of instruction-level parallelism is called superscalar execution.
FLYNN'S CLASSIFICATION
Flynn's taxonomy is a classification of parallel computer architectures
based on the number of concurrent instruction streams (single or multiple) and data
streams (single or multiple) available in the architecture.
The four categories in Flynn's taxonomy are the following:
1. (SISD) single instruction, single data
2. (SIMD) single instruction, multiple data
3. (MISD) multiple instruction, single data
4. (MIMD) multiple instruction, multiple data
Instruction stream: the sequence of instructions as executed by the machine.
Data stream: a sequence of data, including input and partial or temporary results,
referenced by the instruction stream.
Instructions are decoded by the control unit, which then sends them
to the processing units for execution.
The data stream flows between the processors and memory bidirectionally.
SISD
An SISD computing system is a uniprocessor machine which is capable of executing a
single instruction, operating on a single data stream.
SIMD
• An SIMD system is a multiprocessor machine capable of executing the same
instruction on all the CPUs but operating on different data streams.
Machines based on the SIMD model are well suited to scientific computing since
it involves many vector and matrix operations.
So that the information can be passed to all the processing elements (PEs), the
organized data elements of vectors can be divided into multiple sets (N sets for N-PE
systems), and each PE can process one data set.
A dominant representative of SIMD systems is Cray's vector processing machine.
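The division of a vector into N sets for N processing elements can be sketched as follows. The PEs are simulated one after another here; on real SIMD hardware every slice would be processed at the same time, and the function names are invented for the sketch:

```python
# Sketch: SIMD-style data distribution. One instruction (here, "multiply
# by 2") is applied by every processing element (PE) to its own slice of
# the vector. The sequential loop over PEs stands in for hardware that
# would run all slices simultaneously.

def split_among_pes(data, n_pes):
    """Divide the data elements into n_pes roughly equal sets."""
    size = (len(data) + n_pes - 1) // n_pes
    return [data[i:i + size] for i in range(0, len(data), size)]

def simd_apply(instruction, data, n_pes):
    """Each PE applies the same instruction to its own data set."""
    chunks = split_among_pes(data, n_pes)
    processed = [[instruction(x) for x in chunk] for chunk in chunks]
    return [x for chunk in processed for x in chunk]  # gather the results

vector = [1, 2, 3, 4, 5, 6, 7, 8]
assert simd_apply(lambda x: 2 * x, vector, n_pes=4) == [2, 4, 6, 8, 10, 12, 14, 16]
```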
MISD
An MISD computing system is a multiprocessor machine capable of executing
different instructions on different PEs, with all of them operating on the same
data set.
The system performs different operations on the same data set. Machines built
using the MISD model are not useful in most applications; a few machines
have been built, but none of them are available commercially.
MIMD
An MIMD system is a multiprocessor machine which is capable of executing
multiple instructions on multiple data sets.
Each PE in the MIMD model has separate instruction and data streams; therefore
machines built using this model are capable of handling any kind of application.
Unlike SIMD and MISD machines, PEs in MIMD machines work
asynchronously.
MIMD machines are broadly categorized into
shared-memory MIMD and
distributed-memory MIMD
based on the way PEs are coupled to the main memory.
In the shared-memory MIMD model (tightly coupled multiprocessor systems), all the
PEs are connected to a single global memory and they all have access to it. The
communication between PEs in this model takes place through the shared memory;
modification of the data stored in the global memory by one PE is visible to all other PEs.
Dominant representative shared-memory MIMD systems are Silicon Graphics machines
and Sun/IBM's SMP (Symmetric Multi-Processing) machines.
Pipelining
The term Pipelining refers to a technique of decomposing a sequential process into sub-operations,
with each sub-operation being executed in a dedicated segment that operates concurrently with all
other segments.
The most important characteristic of a pipeline technique is that several computations can be in
progress in distinct segments at the same time. The overlapping of computation is made possible by
associating a register with each segment in the pipeline. The registers provide isolation between each
segment so that each can operate on distinct data simultaneously.
The structure of a pipeline organization can be represented simply by including an input register for
each segment followed by a combinational circuit.
Let us consider an example of combined multiplication and addition operation to get a better
understanding of the pipeline organization.
The combined multiplication and addition operation is performed on a stream of numbers such as Ai * Bi + Ci, for i = 1, 2, ..., 7.
The operation to be performed on the numbers is decomposed into sub-operations with each sub-
operation to be implemented in a segment within a pipeline.
The sub-operations performed in each segment of the pipeline are defined as:
R1 ← Ai, R2 ← Bi (input Ai and Bi)
R3 ← R1 * R2, R4 ← Ci (multiply, and input Ci)
R5 ← R3 + R4 (add Ci to the product)
The following block diagram represents the combined as well as the sub-operations performed in
each segment of the pipeline.
Registers R1, R2, R3, and R4 hold the data and the combinational circuits operate in a particular
segment.
The output generated by the combinational circuit in a given segment is applied as an input register
of the next segment. For instance, from the block diagram, we can see that the register R3 is used as
one of the input registers for the combinational adder circuit.
In the pipelined approach, the stages overlap without disturbing one another.
Each task's waiting time is reduced, and each stage's idle time is reduced.
The same principles apply to processors where the pipeline instruction-execution is applied.
The MIPS instructions classically take five steps or stages:
1. IF: Instruction fetch from memory
2. ID: Instruction decode & register read
3. EX: Execute operation or calculate address
4. MEM: Access memory operand
5. WB: Write result back to register
All the pipeline stages take a single clock cycle, so the clock cycle must be long enough to
accommodate the slowest operation.
If the stages are perfectly balanced, then the time between instructions on the pipelined
processor, under ideal conditions, is equal to:
Time between instructions (pipelined) = Time between instructions (non-pipelined) / Number of pipe stages
Under ideal conditions and with a large number of instructions, the speed-up from
pipelining is approximately equal to the number of pipe stages; a five-stage pipeline is nearly
five times faster.
In the pipelined approach, a sequence of load instructions is executed with the stages
overlapped, as given below.
The total time taken for execution is 1400 ps, while in the non-pipelined approach it is 2400 ps.
But as per the formula:
Time taken (pipelined approach) = Time taken (non-pipelined approach) / Number of stages
= 2400 ps / 5
= 480 ps
The practical result, however, is 1400 ps.
So only when the number of instructions in pipelined execution is high enough can the
theoretical execution speed be achieved or nearly achieved. Pipelining improves performance by
increasing instruction throughput.
In general, the pipeline organization is applicable for two areas of computer design
which includes:
1. Arithmetic Pipeline
2. Instruction Pipeline
Array Processor: Architecture, Types, Working & Its Applications
A supercomputer is a very powerful computer whose architecture,
resources & components give huge computing power to the
consumer. A supercomputer also contains a large number
of processors which perform millions or billions of computations each
second, so these computers can perform numerous tasks in a few
seconds. There are three types of supercomputers: tightly connected cluster
computers that work together like a single unit; commodity computers
connected by low-latency & high-bandwidth LANs; and finally vector
processing computers, which depend on an array processor or vectors. An
array processor is like a CPU that helps in performing mathematical
operations on various data elements. The most famous array processor is
the ILLIAC IV computer, which was designed by the Burroughs Corporation.
This article discusses an overview of an array processor: working, types
& applications.
This processor includes a master control unit and main memory. The
master control unit controls the operation of the processing
elements. It also decodes the instruction & determines how the
instruction is executed. So, if the instruction is program control or scalar,
then it is executed directly in the master control unit. Main memory is
mainly used to store the program, while every processing unit uses
operands that are stored in its local memory.
Advantages
The advantages of an array processor include the following.
• Process migration.
Cache Coherence Protocols
1. MSI Protocol:
This is a basic cache coherence protocol used in multiprocessor systems. The letters of the protocol
name identify the possible states a cache block can be in. So, for MSI each block can have one of the
following possible states:
• Modified –
The block has been modified in the cache, i.e., the data in the cache is inconsistent with the
backing store (memory). So, a cache with a block in the "M" state has the responsibility to write
the block to the backing store when it is evicted.
• Shared –
This block is not modified and is present in at least one cache. The cache can evict the data
without writing it to the backing store.
• Invalid –
This block is invalid and must be fetched from memory or from another cache if it is to be
stored in this cache.
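The three MSI states and the transitions between them can be sketched as a small table-driven state machine. This is a simplified illustration: real protocols also drive bus transactions and write-backs, and the event names here are invented for the sketch.

```python
# Minimal sketch of MSI state transitions for a single cache block,
# as seen by one cache. States: 'M' (Modified), 'S' (Shared), 'I' (Invalid).
# Event names are illustrative, not from any real protocol specification.

TRANSITIONS = {
    # (state, event): new_state
    ('I', 'processor_read'):  'S',  # fetch the block; others may share it
    ('I', 'processor_write'): 'M',  # fetch and take exclusive ownership
    ('S', 'processor_write'): 'M',  # upgrade: other copies get invalidated
    ('S', 'bus_write'):       'I',  # another cache wrote: invalidate our copy
    ('M', 'bus_read'):        'S',  # write the block back, keep a shared copy
    ('M', 'bus_write'):       'I',  # write back, then invalidate
}

def next_state(state, event):
    """Return the new MSI state; events not listed leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)

# A block written locally, then read and written by another processor:
s = 'I'
s = next_state(s, 'processor_write')   # I -> M
assert s == 'M'
s = next_state(s, 'bus_read')          # M -> S (block written back to memory)
assert s == 'S'
s = next_state(s, 'bus_write')         # S -> I (another cache modified it)
assert s == 'I'
```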
2. MOSI Protocol:
This protocol is an extension of the MSI protocol. It adds the following state to the MSI protocol:
• Owned –
It indicates that the present processor owns this block and will service requests from other
processors for the block.
3. MESI Protocol –
It is the most widely used cache coherence protocol. Every cache line is marked with one of the
following states:
• Modified –
This indicates that the cache line is present in the current cache only and is dirty, i.e., its value
differs from main memory. The cache is required to write the data back to main
memory in the future, before permitting any other read of the (now invalid) main memory state.
• Exclusive –
This indicates that the cache line is present in the current cache only and is clean, i.e., its value
matches the main memory value.
• Shared –
It indicates that this cache line may be stored in other caches of the machine.
• Invalid –
It indicates that this cache line is invalid.
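The benefit of the extra Exclusive state over MSI can be sketched as follows. The helper names are invented for the illustration and do not come from any vendor's implementation: a block read by only one cache enters E, and a later write to it can upgrade silently to M without broadcasting an invalidation.

```python
# Sketch: how MESI's Exclusive state avoids bus traffic on a write.
# On a read miss, the block enters 'E' if no other cache holds it,
# otherwise 'S'. A write to an 'E' block upgrades silently to 'M';
# a write to an 'S' or 'I' block must invalidate other copies first.

def on_read_miss(other_caches_have_copy):
    return 'S' if other_caches_have_copy else 'E'

def on_processor_write(state):
    """Return (new_state, needs_bus_invalidate)."""
    if state == 'E':
        return 'M', False   # silent upgrade: nobody else has a copy
    if state in ('S', 'I'):
        return 'M', True    # must invalidate other caches' copies
    return 'M', False       # already Modified

state = on_read_miss(other_caches_have_copy=False)
assert state == 'E'
state, bus = on_processor_write(state)
assert (state, bus) == ('M', False)   # no invalidation broadcast needed
```

In plain MSI the same read-then-write sequence would pass through S and therefore always pay for an invalidation broadcast, even when no other cache ever held the block.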
4. MOESI Protocol:
This is a full cache coherence protocol that encompasses all of the possible states commonly used in
other protocols. Each cache line is in one of the following states:
• Modified –
A cache line in this state holds the most recent, correct copy of the data while the copy in the
main memory is incorrect and no other processor holds a copy.
• Owned –
A cache line in this state holds the most recent, correct copy of the data. It is similar to the
shared state in that other processors can hold a copy of the most recent, correct data; unlike
the shared state, however, the copy in main memory can be incorrect. Only one processor can hold
the data in the owned state, while all other processors must hold the data in the shared state.
• Exclusive –
A cache line in this state holds the most recent, correct copy of the data. The main memory
copy is also the most recent, correct copy of the data, and no other processor holds a copy.
• Shared –
A cache line in this state holds the most recent, correct copy of the data. Other processors in the
system may hold copies of the data in the shared state as well. The main memory copy is also the
most recent, correct copy of the data, provided no other processor holds it in the owned state.
• Invalid –
A cache line in this state does not hold a valid copy of the data. Valid copies of the data can be
either in main memory or in another processor's cache.
Clusters In Computer Organisation
A cluster is a set of loosely or tightly connected computers working together as a unified computing
resource that can create the illusion of being one machine. Computer clusters have each node set to
perform the same task, controlled and produced by the software.
Clustered operating systems work similarly to parallel operating systems as they have many CPUs.
Cluster systems are created when two or more computer systems are merged. Basically, each node is an
independent computer, but the nodes have common storage and work together.
The components of clusters are usually connected using fast local area networks, with each node running
its own instance of an operating system. In most circumstances, all the nodes use the same hardware
and the same operating system, although in some setups different hardware or different operating
systems can be used.
In the field of computer organization, a cluster refers to a set of interconnected computers or servers
that collaborate to provide a unified computing resource. Clustering is an effective method to ensure
high availability, scalability, and fault tolerance in computer systems.
Clusters can be categorized into two major types, namely high-availability clusters and load-balancing
clusters. High-availability clusters guarantee uninterrupted service provision even when one or more
nodes fail. Multiple nodes are configured to provide redundant services, so that in case of failure,
another node takes over the failed node’s services without any interruption to the user. On the other
hand, load-balancing clusters distribute workloads among nodes in the cluster to ensure that no
single node is overburdened.
Several hardware and software technologies can be used to implement clusters, including dedicated
clustering hardware, virtualization technologies, and distributed software frameworks.
Clustering provides several benefits such as high availability, scalability, fault tolerance, and load
balancing. Nevertheless, there are a few challenges associated with clustering, such as complexity,
cost, and management.
To make a cluster more efficient, two kinds of clusters exist:
• Hardware Cluster
• Software Cluster
A hardware cluster enables high-performance disk sharing between systems, while
a software cluster allows all systems to work together.
• Symmetric Cluster: In this type of clustering, all the nodes run applications and monitor
other nodes at the same time. This clustering is more efficient than asymmetric clustering, as
it does not need a hot standby node.
Classification of Clusters:
Computer clusters are arranged to support different purposes, ranging from general-
purpose business needs such as web-service support to computation-intensive scientific calculation.
Basically, there are three types of Clusters, they are:
• Load-Balancing Cluster – A cluster requires an effective capability for balancing the load
among the available computers. Here, cluster nodes share a computational workload to
enhance the overall performance. For example, a high-performance cluster used for
scientific calculation would balance load with different algorithms from a web-server
cluster, which may just use a round-robin method, assigning each new request to a
different node. This type of cluster is used on farms of Web servers (web farms).
• Fail-Over Clusters – The function of switching applications and data resources over from a
failed system to an alternative system in the cluster is referred to as fail-over. These types are
used to cluster mission-critical database, mail, file, and application servers.
• High-Availability Clusters – These are also known as "HA clusters". They offer a high
probability that all the resources will be in service. If a failure does occur, such as a system
going down or a disk volume being lost, then the queries in progress are lost. Any lost query, if
retried, will be serviced by a different computer in the cluster. This type of cluster is widely
used in web, email, news, or FTP servers.
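The round-robin assignment mentioned for web farms can be sketched in a few lines. This is an illustrative toy with invented names; production balancers also weigh node load and health-check their nodes:

```python
# Sketch: round-robin request assignment in a load-balancing cluster.
# Each new request goes to the next node in turn, wrapping around.
import itertools

class RoundRobinBalancer:
    def __init__(self, nodes):
        self._cycle = itertools.cycle(nodes)  # endless rotation over nodes

    def assign(self, request):
        """Pick the next node in rotation and pair it with the request."""
        node = next(self._cycle)
        return node, request

nodes = ['web1', 'web2', 'web3']
lb = RoundRobinBalancer(nodes)
assigned = [lb.assign(f'req{i}')[0] for i in range(6)]
assert assigned == ['web1', 'web2', 'web3', 'web1', 'web2', 'web3']
```

Round-robin ignores how busy each node actually is, which is why the text distinguishes it from the load-aware balancing a scientific-computation cluster would need.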
Benefits:
• Absolute scalability – It is possible to create large clusters that exceed the power of even the
largest standalone machines. A cluster can have dozens of multiprocessor machines.
• Additional scalability – A cluster is configured in such a way that it is possible to add new
systems to the cluster in small increments. Clusters can scale
horizontally: more computers may be added to the cluster to improve its
performance, redundancy, and fault tolerance (the ability of the system to continue working
despite a malfunctioning node).
• High availability – Since each node in a cluster is a standalone computer, the
failure of one node does not mean loss of service. A single node can be taken down for
maintenance while the rest of the cluster takes on the load of that node.
High Performance: Clusters are designed to provide high performance computing by utilizing the
processing power of multiple computers working together.
Scalability: Clusters are scalable, which means that they can easily accommodate new nodes or
computers to increase processing power and performance.
Fault Tolerance: Clusters are designed to be fault-tolerant, which means that they can continue to
operate even if one or more nodes fail. This is achieved through redundant hardware, software, or
both.
Load Balancing: Clusters use load balancing techniques to distribute processing workload across
multiple nodes in a balanced manner. This helps to maximize performance and prevent overloading
of individual nodes.
Interconnectivity: Clusters are interconnected through a high-speed network that allows for efficient
communication and data transfer between nodes.
Shared Resources: Clusters allow for shared access to resources such as storage, memory, and
input/output devices. This makes it easier to manage resources and reduces the need for
duplication.
Versatility: Clusters can be used for a wide range of applications, including scientific computing, data
analysis, and web serving.