Unit V 2

The Cray Y-MP was a supercomputer sold by Cray Research from 1988 that used a vector processor design. It consisted of a small number (up to 8) of powerful vector processors that could each fetch operands, store a value, and perform I/O simultaneously. Computation was divided among vector integer, floating-point, and scalar integer units. The processors were connected to central memory via a multistage crossbar network. The DASH project at Stanford University aimed to build an experimental cache-coherent multiprocessor (CC-NUMA) with a two-level processor-to-memory interconnect. Within clusters of 4-16 processors, memory was accessed via a shared bus, while clusters were interconnected by a pair of mesh networks.

Uploaded by

KULDEEP NARAYAN MINJ


Vector-Parallel Cray Y-MP
General Info
Cray Research - American supercomputer manufacturer, predecessor of Cray Inc.
Cray Y-MP - supercomputer sold by Cray Research from 1988 as the successor to its X-MP.
Vector processor - a processor that executes one instruction on a large number of data items, with a great deal of overlap between the element operations.
Cray Y-MP
- Consists of a very small number (up to 8) of very powerful vector processors.
- Can be viewed as a time-multiplexed implementation of SIMD parallel processing.
- Classified as a hybrid SIMD/MIMD machine.
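The "time-multiplexed SIMD" view can be illustrated with a small sketch. It is only a toy model: the function names and the 3-cycle adder latency are assumptions made for illustration, not Cray Y-MP figures. Both functions apply the same operation to every element; the spatial version imagines one processing element per data item, while the vector version streams the elements through a single pipelined unit over successive cycles.

```python
def simd_add(a, b):
    """Spatial SIMD: conceptually one processing element per data item, one step."""
    results = [x + y for x, y in zip(a, b)]
    steps = 1
    return results, steps


def vector_add(a, b, latency=3):
    """Time-multiplexed SIMD: one pipelined adder, one element pair enters per cycle.

    `latency` is a hypothetical pipeline depth chosen only for illustration.
    """
    results = []
    for x, y in zip(a, b):      # elements enter the pipeline one after another
        results.append(x + y)
    # First result appears after `latency` cycles, then one result per cycle.
    cycles = latency + len(results) - 1 if results else 0
    return results, cycles


if __name__ == "__main__":
    print(simd_add([1, 2, 3], [4, 5, 6]))
    print(vector_add([1, 2, 3], [4, 5, 6]))
```

The results are identical; only the way the work is spread over hardware and time differs.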
Cray Y-MP
- Each processor has 4 ports to access central memory, with each port delivering 128 bits per clock cycle (4 ns).
- Thus a CPU can fetch 2 operands (a vector and a scalar), store 1 value, and perform I/O simultaneously.
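As a rough check of what these port figures imply, the peak bandwidth can be worked out directly from the numbers on the slide (128 bits per port per 4 ns cycle, 4 ports). The function name below is hypothetical, and the result is a peak figure that ignores memory bank conflicts and other real-world effects.

```python
def port_bandwidth_bytes_per_s(bits_per_cycle=128, clock_ns=4):
    """Peak bytes per second through one memory port (integer arithmetic)."""
    bytes_per_cycle = bits_per_cycle // 8            # 128 bits -> 16 bytes
    cycles_per_second = 1_000_000_000 // clock_ns    # 4 ns -> 250 million cycles/s
    return bytes_per_cycle * cycles_per_second


PORTS_PER_CPU = 4  # from the slide: 4 ports per processor

if __name__ == "__main__":
    per_port = port_bandwidth_bytes_per_s()
    print(per_port)                   # bytes/s for one port
    print(per_port * PORTS_PER_CPU)   # bytes/s for all four ports of one CPU
```

Under these assumptions each port peaks at 4 GB/s, or 16 GB/s per processor with all four ports busy.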
The computation section of the CPU is divided into 4 subsystems, among them the vector integer unit, the floating-point unit, and scalar integer operations:
- Vector integer unit: performs vector integer operations through separate functional units for add/subtract, shift, logic, and bit-counting.
- Floating-point unit: performs vector floating-point operations through separate functional units for add/subtract, multiply, and reciprocal approximation.
- Scalar integer operations: performs all ...
As new data are being loaded into two registers and emptied from a third one, other vector registers can supply the operands and receive the results of vector instructions.

Vector function units can be chained to allow the next data-dependent vector computation to begin before the current one has stored all of its results in a vector register.

For example, a vector multiply–add operation can be done by chaining the floating-point multiply and add units. This causes the add unit to begin its vector operation as soon as the multiply unit has deposited its first result in a vector register.
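The benefit of chaining can be estimated with a simple timing model. This is a sketch under stated assumptions: each unit is fully pipelined (one element per cycle after its latency), and the latencies of 7 and 6 cycles and the vector length of 64 are illustrative values, not figures taken from the slides.

```python
def unchained_cycles(n, mul_latency, add_latency):
    """The add starts only after the multiply has stored all n results."""
    multiply_done = mul_latency + n - 1
    return multiply_done + (add_latency + n - 1)


def chained_cycles(n, mul_latency, add_latency):
    """The add starts as soon as the first product is available.

    Element i leaves the multiplier at cycle mul_latency + i and
    leaves the adder add_latency cycles after that.
    """
    return max(mul_latency + i + add_latency for i in range(n))


if __name__ == "__main__":
    n, mul, add = 64, 7, 6   # hypothetical vector length and latencies
    print(unchained_cycles(n, mul, add))
    print(chained_cycles(n, mul, add))
```

With these numbers the chained multiply-add takes roughly half as long, because the two vector operations overlap almost completely.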
Processor-to-memory interconnection of Cray Y-MP
- A multistage crossbar network built of 4×4 and 8×8 crossbar switches and 1×8 demultiplexers.
- The network uses circuit switching and ensures that multiple access requests from the same port are satisfied in presentation order.
CC-NUMA Stanford DASH
Introduction
Stanford University aimed to build an experimental cache-coherent multiprocessor in its Directory Architecture for Shared Memory (DASH) project.
DASH can be classified as a cache-coherent NUMA (CC-NUMA) architecture.
DASH has a 2-level processor-to-memory interconnection structure. Within a cluster of 4-16 processors, memory is accessed via a shared bus.

Each processor in a cluster has:

1. A private instruction cache (write-through policy)
2. A separate data cache (write-through policy)
3. A level-2 cache (write-back policy)
The clusters are interconnected by a pair of wormhole-routed 2-D mesh networks:

1. A REQUEST mesh (which carries remote memory access requests)
2. A REPLY mesh (which routes data and acknowledgments back to the requesting cluster)

DATA ACCESS LOCALITY leads to better performance.
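Why locality matters can be seen by tracing the path of a memory access through the two-level interconnect. The sketch below is a simplified model, not DASH's actual routing logic: the `Cluster` coordinates and function names are hypothetical, and it assumes minimal routing, so the hop count between two clusters is their Manhattan distance on the 2-D mesh.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Position of a cluster on the 2-D mesh (hypothetical coordinates)."""
    x: int
    y: int


def mesh_hops(src, dst):
    """Hops between two clusters, assuming minimal routing on a 2-D mesh."""
    return abs(src.x - dst.x) + abs(src.y - dst.y)


def access_path(requester, home):
    """List the interconnect stages an access goes through."""
    if requester == home:
        return ["cluster bus"]                    # local: the shared bus only
    hops = mesh_hops(requester, home)
    return [
        "cluster bus",                            # leave the requesting cluster
        f"REQUEST mesh ({hops} hops)",            # carry the request to the home cluster
        "home cluster bus",                       # access memory at the home cluster
        f"REPLY mesh ({hops} hops)",              # return data to the requester
    ]


if __name__ == "__main__":
    print(access_path(Cluster(0, 0), Cluster(0, 0)))
    print(access_path(Cluster(0, 0), Cluster(2, 1)))
```

A local access touches one stage, while a remote one crosses both meshes and two buses, which is why keeping data close to the processors that use it pays off.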

Inside a cluster, cache coherence is enforced by a snoopy protocol, while across clusters it is maintained by a write-invalidate directory protocol.

Unit of data sharing - block or cache line
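The idea behind a write-invalidate directory protocol can be sketched in a few lines. This is a minimal toy model, not DASH's actual protocol: the `Directory` class, its fields, and the two-operation interface are assumptions made for illustration. The directory records, per block, which clusters hold a copy; a write invalidates every other copy so that the writer holds the only one.

```python
class Directory:
    """Toy per-block directory for a write-invalidate protocol (hypothetical)."""

    def __init__(self):
        self.sharers = {}   # block -> set of cluster ids holding a valid copy
        self.owner = {}     # block -> cluster id holding the only (dirty) copy

    def read(self, block, cluster):
        """Record a read; a dirty copy held elsewhere becomes shared."""
        if self.owner.get(block) not in (None, cluster):
            # The previous owner writes the block back and keeps a shared copy.
            del self.owner[block]
        self.sharers.setdefault(block, set()).add(cluster)

    def write(self, block, cluster):
        """Record a write; return the clusters whose copies are invalidated."""
        invalidated = self.sharers.get(block, set()) - {cluster}
        self.sharers[block] = {cluster}
        self.owner[block] = cluster
        return sorted(invalidated)


if __name__ == "__main__":
    d = Directory()
    for c in (0, 1, 2):
        d.read(7, c)          # clusters 0, 1, 2 all cache block 7
    print(d.write(7, 1))      # cluster 1 writes: the other copies are invalidated
```

In this model the invalidations are simply returned as a list; in a real machine they would travel as messages over the interconnect, with acknowledgments coming back.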

The clusters are modified in minor ways and augmented with two special boards that hold the directory and network interface subsystems.

The processor board modifications consist of the addition of a bus retry signal and provision of masking capability for the bus arbiter. The retry signal is used when a request involves service from a remote node. The masking capability allows the directory to hold off a processor’s retry (via the bus arbiter) until the requested remote access has been completed.

Thus, effectively, a split-transaction bus protocol is used for performing remote accesses. The added boards contain memory for the directory entries, buffers, and a piece of the global interconnection network.
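The interplay of the retry signal and arbiter masking can be mimicked with a very small state model. Everything here is an illustrative assumption (the `ClusterBus` class, its method names, and the string responses); it only shows the sequence of events described above, in which a remote request is answered with a retry, the processor is masked so it cannot occupy the bus, and it is released once the remote reply has arrived.

```python
class ClusterBus:
    """Toy model of retry + arbiter masking for remote accesses (hypothetical)."""

    def __init__(self, local_blocks):
        self.local_blocks = set(local_blocks)
        self.masked = set()     # processors the arbiter will not grant the bus
        self.arrived = set()    # remote blocks whose reply has come back

    def request(self, cpu, block):
        if cpu in self.masked:
            return "masked"     # retry is held off by the directory
        if block in self.local_blocks or block in self.arrived:
            return "data"       # served within the cluster
        self.masked.add(cpu)    # remote: signal retry and mask the processor
        return "retry"

    def remote_reply(self, cpu, block):
        """The remote access completed: buffer the block and unmask the processor."""
        self.arrived.add(block)
        self.masked.discard(cpu)


if __name__ == "__main__":
    bus = ClusterBus(local_blocks={1})
    print(bus.request(0, 9))    # remote block: retry, processor 0 is masked
    print(bus.request(1, 1))    # the bus stays free for other processors
    bus.remote_reply(0, 9)
    print(bus.request(0, 9))    # the retried request now succeeds
```

Because the bus is not held while the remote access is outstanding, other processors can keep using it, which is the effect of a split-transaction protocol.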
Thank You!
