Chapter 06
ARM
Chapter 6
Parallel Processors from
Client to Cloud
§6.1 Introduction
Goal: connecting multiple computers to get higher performance
  Multiprocessors
  Scalability, availability, power efficiency
Task-level (process-level) parallelism
  High throughput for independent jobs
Parallel processing program
  Single program run on multiple processors
Multicore microprocessors
  Chips with multiple processors (cores)
Hardware and Software
Hardware
  Serial: e.g., Pentium 4
  Parallel: e.g., quad-core Xeon E5345
Software
  Sequential: e.g., matrix multiplication
  Concurrent: e.g., operating system
Sequential/concurrent software can run on serial/parallel hardware
Challenge: making effective use of parallel hardware
half = 100; /* number of processors; Pn is this processor's number */
repeat
  synch(); /* wait until all processors finish the previous step */
  if (half%2 != 0 && Pn == 0)
    sum[0] = sum[0] + sum[half-1];
    /* Conditional sum needed when half is odd;
       Processor0 gets missing element */
  half = half/2; /* dividing line on who sums */
  if (Pn < half) sum[Pn] = sum[Pn] + sum[Pn+half];
until (half == 1);
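The reduction above can be checked with a sequential simulation. The following is a minimal sketch, not a parallel implementation: the function name `tree_reduce` is hypothetical, the inner `for` loop stands in for the processors that would run at the same time, and a `while` loop replaces `repeat ... until` so that the single-processor case also terminates.

```python
def tree_reduce(partial_sums):
    """Simulate the lock-step reduction; return the total left in sum[0]."""
    sums = list(partial_sums)  # shared array sum[], one slot per processor
    half = len(sums)           # processors still holding a partial sum
    while half > 1:
        # synch(): in this sequential simulation the previous step is
        # already complete when we get here.
        if half % 2 != 0:
            # Odd count: processor 0 absorbs the element with no partner.
            sums[0] += sums[half - 1]
        half //= 2             # dividing line on who sums
        # Processors 0..half-1 each add in one partner's value. Every write
        # goes to an index below half and every read to an index at or above
        # half, so the order of iterations does not matter.
        for pn in range(half):
            sums[pn] += sums[pn + half]
    return sums[0]


if __name__ == "__main__":
    # 100 partial sums, matching half = 100 in the pseudocode.
    print(tree_reduce(range(100)))
```

With 100 processors the dividing line moves through 50, 25, 12, 6, 3, 1, so the odd-case fix-up by processor 0 is needed at 25 and again at 3.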
8 × streaming processors
Network topologies: bus, ring, N-cube (N = 3), 2D mesh, fully connected
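One way to compare these topologies is by how many links each needs. The sketch below is illustrative only: the function names are hypothetical, it counts node-to-node links and nothing else, and it assumes a 2D mesh without wraparound connections. A bus is left out because it is a single shared medium rather than a set of point-to-point links.

```python
def ring_links(p):
    """Ring of p nodes (p >= 3): each node links to its next neighbor."""
    return p


def fully_connected_links(p):
    """Every pair of the p nodes has its own link."""
    return p * (p - 1) // 2


def n_cube_links(n):
    """N-cube with 2**n nodes: n links per node, each shared by two nodes."""
    return n * 2 ** (n - 1)


def mesh_2d_links(rows, cols):
    """2D mesh without wraparound: horizontal plus vertical links."""
    return rows * (cols - 1) + cols * (rows - 1)


if __name__ == "__main__":
    print("ring, 8 nodes:", ring_links(8))
    print("fully connected, 8 nodes:", fully_connected_links(8))
    print("N-cube, N = 3 (8 nodes):", n_cube_links(3))
    print("2D mesh, 4 x 4 (16 nodes):", mesh_2d_links(4, 4))
```

For the same eight nodes, the ring is the cheapest of the three point-to-point options and the fully connected network the most expensive, with the N-cube in between.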
Attainable GFLOPs/sec
= Min ( Peak Memory BW × Arithmetic Intensity, Peak FP Performance )
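The roofline bound takes the smaller of the memory-limited rate and the compute-limited rate. A minimal sketch follows, in which the function name, parameter names, and machine numbers (10 GB/sec peak memory bandwidth, 40 GFLOPs/sec peak floating-point performance) are all made up for illustration and do not describe any real processor.

```python
def attainable_gflops(peak_mem_bw_gb_s, arithmetic_intensity, peak_fp_gflops):
    """Roofline bound in GFLOPs/sec.

    peak_mem_bw_gb_s:     peak memory bandwidth, GB/sec
    arithmetic_intensity: FLOPs performed per byte of memory traffic
    peak_fp_gflops:       peak floating-point performance, GFLOPs/sec
    """
    memory_bound = peak_mem_bw_gb_s * arithmetic_intensity
    return min(memory_bound, peak_fp_gflops)


if __name__ == "__main__":
    PEAK_BW = 10.0   # hypothetical machine, GB/sec
    PEAK_FP = 40.0   # hypothetical machine, GFLOPs/sec
    for intensity in (0.5, 2.0, 4.0, 8.0):
        bound = attainable_gflops(PEAK_BW, intensity, PEAK_FP)
        print(f"{intensity} FLOPs/byte: at most {bound} GFLOPs/sec")
```

On this hypothetical machine, kernels below 4 FLOPs/byte are limited by memory bandwidth, and kernels above it are limited by peak floating-point performance.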