
Chapter 03

Chapter 3 discusses various processor organizations such as mesh, binary trees, hypertrees, hypercubes, and more, focusing on their interconnection networks. It also covers Flynn's Taxonomy, categorizing computer architectures based on instruction and data streams, including SISD, SIMD, MISD, and MIMD. Additionally, the chapter explains the characteristics and examples of processor arrays, multiprocessors, and multicomputers, highlighting their configurations and performance metrics.

Uploaded by

abdallahm.alsoud

Chapter 3: Processor Arrays, Multiprocessors, and Multicomputers
 Processor Organizations
(interconnection networks)

Mesh

Binary Trees

Hypertrees

Hypercube

Pyramid

Butterfly

Cube-Connected Cycles

Shuffle-Exchange

De Bruijn
 Flynn’s Taxonomy
 Processor Arrays
 Multiprocessors
 Multicomputers
 Scaled Speedup & Parallelizability
Processor Organizations
(interconnection networks)
 We evaluate processor organizations according to the following criteria:
a) Diameter: largest distance between two
nodes.
b) Bisection width: the minimum number of
edges that must be removed in order to
divide the network into two halves.
c) Degree: max # of edges per node.
d) Maximum edge length: we need it to
be a constant.
1) Mesh Networks
 The nodes are arranged in a q-dimensional lattice.
 Communication is allowed only between neighboring nodes; hence interior nodes communicate with 2q other processors.
 The diameter of a q-dimensional mesh with k^q nodes is q(k-1).
 The bisection width of a q-dimensional mesh with k^q nodes is k^(q-1).
 The max # of edges per node is 2q.
 The max edge length is constant for two- and three-dimensional meshes.
 Ex: MPP, MasPar, Intel Paragon XP/S
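These mesh formulas can be checked with a short Python sketch (the function names are mine, not from the chapter):

```python
# Metrics of a q-dimensional mesh with k^q nodes, as stated in the slides.

def mesh_diameter(q: int, k: int) -> int:
    """Largest distance between two nodes: q * (k - 1)."""
    return q * (k - 1)

def mesh_bisection_width(q: int, k: int) -> int:
    """Minimum # of edges removed to split the network in half: k^(q-1)."""
    return k ** (q - 1)

def mesh_max_degree(q: int) -> int:
    """Interior nodes communicate with 2q neighbors."""
    return 2 * q

# The 4x4 mesh from the figure: q = 2, k = 4.
print(mesh_diameter(2, 4))         # -> 6
print(mesh_bisection_width(2, 4))  # -> 4
print(mesh_max_degree(2))          # -> 4
```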

[Figures: a 4x4 mesh; a mesh with wrap-around within the same row or column; a mesh with wrap-around to adjacent rows or columns]
Connectivity
 An interior processor P(i,j) has a link with the following processors:
 P(i,j+1)
 P(i,j-1)
 P(i+1,j)
 P(i-1,j)
Binary Tree Network
 It has 2^k - 1 nodes, where k is the # of levels.
 A node has at most three links.
 An interior node can communicate with its two children and its parent.
 The diameter is 2(k-1), which is low.
 It has a poor bisection width, which is one.

[Figure: a 4-level binary tree; level 0 is the root, levels 1 and 2 are internal nodes, level 3 holds the leaves]

 k = 4 levels
 # of nodes = 2^k - 1 = 2^4 - 1 = 15

H.W: Quad Trees
Find out the following:
a) # of nodes
b) Diameter
c) Bisection width
d) Degree
Hypertree Networks
 It has a low diameter and an improved bisection width, an improvement on the binary tree.
 A 4-ary hypertree with depth d has 4^d leaf nodes and 2^d (2^(d+1) - 1) nodes in all.
 The diameter is 2d (= 4 for d = 2).
 The bisection width is 2^(d+1) (= 8 for d = 2).
 The max degree is 6.
 Ex: Connection Machine CM-5
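A quick sketch of the hypertree counts, using the formulas above (helper names are illustrative):

```python
# 4-ary hypertree of depth d, metrics as given in the slides.

def hypertree_leaves(d: int) -> int:
    return 4 ** d                        # 4^d leaf nodes

def hypertree_nodes(d: int) -> int:
    return 2 ** d * (2 ** (d + 1) - 1)   # 2^d (2^(d+1) - 1) nodes in all

def hypertree_diameter(d: int) -> int:
    return 2 * d

def hypertree_bisection(d: int) -> int:
    return 2 ** (d + 1)

print(hypertree_leaves(2), hypertree_nodes(2))        # -> 16 28
print(hypertree_diameter(2), hypertree_bisection(2))  # -> 4 8
```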

[Figure: side and front views of a hypertree]

H.W: Draw or construct a 3-dimensional hypertree with d = 2.

Ex: Hypertree with d = 2
# of leaf nodes = 4^2 = 16
# of all nodes = 2^2 (2^3 - 1) = 28
Pyramid Network
 It is an attempt to obtain the advantages of both mesh & tree networks.
 A pyramid of size k^2 is a complete 4-ary rooted tree of height log2 k with additional interprocessor links, so that the processors in every tree level form a 2-D mesh.
[Figure: a pyramid of size 16 = 4^2; the apex is at level 2, and the base (level 0) is a 4x4 mesh]

 A pyramid of size k^2 has as its base a 2-D mesh network of size k^2.
 The total # of processors is (4/3)k^2 - (1/3).
   Ex: when k = 4, (4/3)4^2 - (1/3) = 21.
 Every interior node is connected to 9 other nodes: 4 mesh neighbors + 4 children + 1 parent = 9.
 It has a lower diameter than the mesh: 2 log2 k (= 4 when k = 4).
 Its bisection width is 2k.
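The pyramid counts above can be verified in a few lines (names are mine; the node total is computed exactly in integer arithmetic):

```python
import math

def pyramid_processors(k: int) -> int:
    """Total # of processors: (4/3) k^2 - 1/3, exact for integer k."""
    return (4 * k * k - 1) // 3

def pyramid_diameter(k: int) -> int:
    """Diameter 2 log2 k (k a power of two)."""
    return 2 * int(math.log2(k))

print(pyramid_processors(4))  # -> 21
print(pyramid_diameter(4))    # -> 4
```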
Butterfly Network

[Figure: a butterfly with k = 3; ranks 0 through 3, each containing nodes 0 through 7]

 It consists of (k+1)2^k nodes divided into k+1 rows (ranks), each containing n = 2^k nodes; the ranks are labeled 0 through k.
 Ex: when k = 3:
   (k+1)2^k = 4*8 = 32 nodes
   The diameter is 2k = 6.
   The bisection width is 2^k = 8.
Connectivity
 Node (i,j) refers to the jth node on the ith rank.
 Node (i,j) is connected to two nodes on rank i-1: node (i-1,j) and node (i-1,m), where m is the integer obtained by inverting the ith most significant bit in the binary representation of j.
Ex:
 Node (2,6) is connected to nodes (1,4) and (1,6).
 How to get node (1,4)?
   6 -> 110 -> invert the 2nd most significant bit -> 100 -> 4 -> (1,4)
Hypercube Networks
 Also called a binary n-cube.
 It consists of 2^k nodes forming a k-dimensional hypercube.
 The nodes are labeled 0, 1, ..., 2^k - 1; two nodes are adjacent if their labels differ in exactly one bit position.

[Figure: hypercubes for k = 1 (N = 2, nodes 0-1), k = 2 (N = 4, nodes 00-11), and k = 3 (N = 8, nodes 000-111)]
Ex: k = 3

[Figure: the k = 3 hypercube with nodes 0 through 7, showing the 1-D, 2-D, and 3-D links]
 The diameter of a hypercube with n = 2^k nodes is log2(2^k) = k.
 The bisection width is 2^(k-1) (= 2^2 = 4 for k = 3).
 The degree is k.
 It is the most popular processor organization.
 Ex: nCUBE, Connection Machine CM-200.
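The adjacency rule (labels differ in exactly one bit) reduces to checking that the XOR of the two labels is a power of two; a minimal sketch:

```python
def hypercube_adjacent(a: int, b: int) -> bool:
    """Two hypercube labels are adjacent iff they differ in exactly one bit."""
    x = a ^ b
    return x != 0 and (x & (x - 1)) == 0   # true iff x is a power of two

# k = 3 cube: 110 and 111 differ only in the last bit; 000 and 011 differ in two.
print(hypercube_adjacent(0b110, 0b111))  # -> True
print(hypercube_adjacent(0b000, 0b011))  # -> False
```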
Cube-Connected Cycles Network
 It is a k-dimensional hypercube whose 2^k vertices are actually cycles of k nodes.
 For each dimension, every cycle has a node connected to a node in the neighboring cycle in that dimension.
 The # of nodes is k 2^k (= 3*8 = 24 for k = 3).
Ex: N = 24

[Figure: a cube-connected cycles network with k = 3 and N = k 2^k = 3*8 = 24 nodes; each hypercube vertex is replaced by a 3-node cycle, with cycle links in one dimension and cross links in the other dimensions]
Connectivity
 Node (i,j) is connected to node (i,m) if and only if m is the result of inverting the ith most significant bit of the binary representation of j.
 Ex: node (2,5) is connected to node (2,7). How?
   Invert the 2nd most significant bit of j = 5 = 101, which gives 111 = 7.

 The degree is a constant 3 (an advantage over the hypercube).
 The diameter is 2k, twice that of the hypercube (a disadvantage).
 The bisection width is 2^(k-1), which is lower than that of the hypercube: 2^(k-1) = 2^2 = 4.
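The cross-link rule is the same bit inversion used in the butterfly; a minimal sketch, assuming k-bit cycle labels (the remaining two links of a node are its neighbors within its own cycle):

```python
def ccc_cross_neighbor(i: int, j: int, k: int):
    """Cross link of node (i, j) in a cube-connected cycles network:
    node (i, m), where m is j with its ith most significant bit
    (of k bits) inverted."""
    m = j ^ (1 << (k - i))
    return (i, m)

# The example from the slide: node (2,5) with k = 3.
print(ccc_cross_neighbor(2, 5, 3))  # -> (2, 7)
```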
Shuffle-Exchange Networks
 It consists of n = 2^k nodes, numbered 0, 1, ..., n-1.
 It has two kinds of connections:
 Exchange: links pairs of nodes whose numbers differ in their least significant bit (bidirectional).
 Shuffle: links node i with node 2i mod (n-1), with the exception that node n-1 is connected to itself (directed).
Ex: k = 3

[Figure: nodes 0 through 7, with binary labels 000 through 111, joined by shuffle and exchange links]

Ex: Node 2 is connected to node 3 through an exchange link, and node 2 is connected to node (2*2) mod 7 = 4 through a shuffle link.

 Connectivity via a left cyclic shift: node a(k-1) a(k-2) .. a1 a0 is connected to node a(k-2) .. a1 a0 a(k-1) by a shuffle.
 Ex: 001 -> 010 -> 100 -> 001, and this is called a necklace.
 Diameter = 2k - 1
 Bisection width = 2^(k-1)/k
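Both link types can be sketched directly from their definitions; the necklace example above reappears as repeated shuffles (function names are mine):

```python
def exchange(i: int) -> int:
    """Exchange link: flip the least significant bit."""
    return i ^ 1

def shuffle(i: int, n: int) -> int:
    """Shuffle link: 2i mod (n - 1), except node n-1 maps to itself."""
    return i if i == n - 1 else (2 * i) % (n - 1)

n = 8  # k = 3
print(exchange(2))    # -> 3
print(shuffle(2, n))  # -> 4
# The necklace 001 -> 010 -> 100 -> 001 as repeated shuffles:
print(shuffle(1, n), shuffle(2, n), shuffle(4, n))  # -> 2 4 1
```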
De Bruijn Networks
 It consists of n = 2^k nodes.
 Let a(k-1) a(k-2) .. a1 a0 be a node; then the two nodes reachable from it via directed edges are:
   a(k-2) a(k-3) .. a1 a0 0
   a(k-2) a(k-3) .. a1 a0 1
Ex: k = 3

[Figure: the 8-node De Bruijn network with nodes 000 through 111]

 The diameter is k.
 The # of edges per node is constant.
 The bisection width for a network with 2^k nodes is 2^(k-1) (= 2^2 = 4 for k = 3).
 Ex: Triton/1.
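Dropping the leading bit and appending a 0 or 1 is a shift-and-mask on the node label; a minimal sketch:

```python
def de_bruijn_successors(j: int, k: int):
    """The two nodes reachable from node a(k-1)..a0 via directed edges:
    left-shift the k-bit label (dropping the old MSB), then append 0 or 1."""
    s = (2 * j) % (2 ** k)   # a(k-2)..a0 followed by 0
    return (s, s | 1)        # ...and followed by 1

# k = 3: node 101 reaches 010 and 011.
print(de_bruijn_successors(0b101, 3))  # -> (2, 3)
```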
Flynn’s Taxonomy
 Flynn bases his taxonomy on the dual concepts of instruction stream and data stream.
 An instruction stream is a sequence of instructions performed by a computer.
 A data stream is a sequence of data manipulated by an instruction stream.
 The categories depend on the multiplicity of hardware used to manipulate the instruction and data streams.
Flynn’s Taxonomy
 SISD Single Instruction Single
Data.
 SIMD Single Instruction Multiple
Data.
 MISD Multiple Instruction Single
Data.
 MIMD Multiple Instruction Multiple
Data.
SISD

[Figure: a single processor P connected to a memory M, the sequential machine]

•One instruction is executed per unit time.
•Instruction execution may be pipelined.
•The computer may have multiple functional units, but a single control unit.
SISD: A Conventional Computer

[Figure: a single processor receives an instruction stream, consuming a data input stream and producing a data output stream]

 Speed is limited by the rate at which the computer can transfer information internally.
 Ex: PCs, workstations
SIMD

[Figure: a control processor (CP) drives processors P1, P2, ..., Pn through an interconnection network]

 Processor arrays
 Ex: The Connection Machine CM-200
SIMD Architecture

[Figure: a single instruction stream drives processors A, B, and C, each with its own data input stream and data output stream]

Ex: CRAY vector processing machines, Thinking Machines CM*, Intel MMX (multimedia support)
SIMD cont…
 In this configuration, N processing elements are connected via an interconnection network.
 Each processing element (PE) is a processor with local memory.
 The PEs execute the instructions that are distributed to them by the CU via a broadcast bus.
 Each PE then operates on data stored in its own memory and on data broadcast by the CU.
 Data is exchanged among PEs via a unidirectional interconnection network, and the I/O bus is used to transfer data from the PEs to the I/O interface and vice versa.
 To transfer results from particular PEs to the CU,
MISD

[Figure: processors P1, P2, ..., Pn arranged in a chain with a memory M, a systolic array or pipeline]

 More of an intellectual exercise than a practical configuration: few were built, and none are commercially available.
The MISD Architecture

[Figure: instruction streams A, B, and C drive processors A, B, and C; a single data input stream passes through the processors to produce the data output stream]
MIMD

[Figure: processors P1, P2, ..., Pn connected through an interconnection network]

Ex: nCUBE, CM-5, TC2000, Paragon XP/S
MIMD Architecture

[Figure: independent instruction streams A, B, and C drive processors A, B, and C, each with its own data input stream and data output stream]

 Shared-memory (tightly coupled) MIMD
 Distributed-memory (loosely coupled) MIMD
Processor Arrays
 A vector computer is a computer whose instruction set includes operations on vectors as well as scalars.
 There are two ways to implement a vector computer:
a) Pipeline Vector Processor
   It streams vectors from memory to the CPU, where pipelined arithmetic units manipulate them.
   Ex: CRAY-1, CYBER-205

b) Processor Array
   It is a vector computer implemented as a sequential computer connected to a set of identical, synchronized processing elements capable of simultaneously performing the same operation on different data.
   Ex: CM-200, manufactured by Thinking Machines Corporation.
Multiprocessors
 A multiple-CPU computer consists of a # of fully programmable processors, each capable of executing its own program.
 Multiprocessors are multiple-CPU computers with a shared memory.
 Two types of shared memory access:
 Uniform Memory Access (UMA)
 Non-Uniform Memory Access (NUMA)
UMA Multiprocessors
 The shared memory is centralized.
 All processors work through a central switching mechanism to reach the centralized shared memory.
 The switching mechanism can be:
 a common bus to global memory
 a crossbar switch
 a packet-switched network
UMA

[Figure: CPUs connected through a switching mechanism to memory banks and I/O devices]

 Ex: Symmetry by Sequent Computer Systems, Inc.
 The central problem is how to ensure cache consistency:
 write-through policy
 copy-back policy
NUMA Multiprocessors
 The shared memory is distributed.
 Ex: TC2000 by BBN Systems & Technologies.
 Every processor has some nearby memory, and the shared address space is formed by combining these local memories.
 The time needed to access a particular memory location depends on whether that location is local to the processor.
Multicomputers
 A multicomputer has no shared memory. Each processor has its own private memory, and process interaction occurs through message passing.
 Ex: Paragon XP/S, nCUBE, Thinking Machines CM-5
 An important distinction between early (first-generation) multicomputers and second-generation multicomputers is how processors communicate:
 Store-and-forward message passing
 Circuit-switched message routing
Store-and-Forward
 In store-and-forward message passing, to send a message from one processor to a nonadjacent processor, every intermediate processor along the message’s path must store the entire message and then forward it to the next processor down the line.
 This means the CPU is interrupted every time a transfer is initiated.
 Ex: nCUBE/10, Intel iPSC, T800 Transputer
Circuit-Switched
 Every processor has a routing logic card called the Direct-Connect Module (DCM).
 The DCM sets up a circuit from the source node to the destination node; the message then flows in a pipelined fashion from the source node to the destination node, and none of the intermediate nodes store the message.
 This way the CPUs of the intermediate nodes are not interrupted.
 Ex: iPSC/2, nCUBE 2
 Advantages of circuit-switched message passing:
a) No interrupts for intermediate CPUs.
b) Faster, since the message is simply switched through rather than stored.
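To see why avoiding intermediate stores pays off, here is a rough latency model; the formulas, bandwidth B, and per-hop startup time s are my own illustrative assumptions, not figures from the chapter:

```python
# Latency of an L-byte message crossing h hops, under a simple model:
#   store-and-forward: every hop stores the whole message -> h * (L/B + s)
#   circuit-switched:  only the circuit setup pays per hop -> h * s + L/B
# B = link bandwidth (bytes/s), s = per-hop setup time (s).

def store_and_forward_latency(L: float, h: int, B: float, s: float) -> float:
    return h * (L / B + s)

def circuit_switched_latency(L: float, h: int, B: float, s: float) -> float:
    return h * s + L / B

# A 1 MB message over 5 hops at 100 MB/s with 10 us setup per hop:
L, h, B, s = 1_000_000, 5, 100e6, 1e-5
print(store_and_forward_latency(L, h, B, s))  # -> 0.05005 (seconds)
print(circuit_switched_latency(L, h, B, s))   # -> 0.01005 (seconds)
```

Under this model the gap grows with the hop count h, which is why circuit switching matters most on high-diameter networks.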
Scaled Speedup and Parallelizability
 Speedup: the ratio between the time taken by a parallel computer executing the fastest serial algorithm using one processor and the time taken by the same parallel computer executing the corresponding parallel algorithm using p processors.

 Efficiency: the efficiency of a parallel algorithm running on p processors is the speedup divided by p.
 Parallelizability: the ratio between the time taken by a parallel computer executing a parallel algorithm on one processor and the time taken by the same parallel computer executing the same parallel algorithm on p processors.
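The speedup and efficiency definitions translate directly into code; the timing values below are made-up illustrative numbers:

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Fastest serial time on one processor / parallel time on p processors."""
    return t_serial / t_parallel

def efficiency(t_serial: float, t_parallel: float, p: int) -> float:
    """Speedup divided by the number of processors p."""
    return speedup(t_serial, t_parallel) / p

# Hypothetical: a 100 s serial run takes 25 s on 8 processors.
print(speedup(100.0, 25.0))        # -> 4.0
print(efficiency(100.0, 25.0, 8))  # -> 0.5
```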
Amdahl’s Law
 Let f be the fraction of operations in a computation that must be performed sequentially, where 0 <= f <= 1.
 Amdahl’s law states that the maximum speedup S achievable by a parallel computer with p processors performing the computation is:

   S <= 1 / (f + (1 - f)/p)
 This implies the following corollary: a small # of sequential operations can significantly limit the speedup achievable by a parallel computer.
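A minimal sketch of the bound, useful for seeing how quickly even a small sequential fraction f caps the speedup:

```python
def amdahl_bound(f: float, p: int) -> float:
    """Maximum speedup with sequential fraction f on p processors:
    1 / (f + (1 - f)/p)."""
    assert 0.0 <= f <= 1.0
    return 1.0 / (f + (1.0 - f) / p)

# With only 10% sequential work, 10 processors yield at most ~5.26x;
# as p grows without bound, the speedup can never exceed 1/f = 10.
print(amdahl_bound(0.1, 10))
print(amdahl_bound(0.1, 1_000_000))
```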
Scaled Speedup
 The ratio between the time taken by a sequential algorithm running on a single processor of a parallel computer and the time taken by the parallel algorithm on the parallel machine.
Amdahl’s Effect
 It is the phenomenon whereby speedup is an increasing function of the problem size.
[Figure: speedup vs. # of processors for several problem sizes n; larger problems give higher speedup curves, illustrating Amdahl’s effect]
H.W
Read about the following:
1) CM-200

2) Symmetry

3) TC2000

4) nCUBE2

5) CM-5

6) Paragon XP/S
