BDS Session 2
Janardhanan PS
[email protected]
Context
Topics for today
Serial Computing
• Software written for serial computation:
✓ A problem is broken into a discrete series of instructions
✓ Instructions are executed sequentially one after another
✓ Executed on a single processor
✓ Only one instruction may execute at any moment in time
✓ A single data store - memory and disk
Parallel Computing
Spectrum of Parallelism
Pseudo-parallel (intra-processor) → Parallel → Distributed
Multi-processor Vs Multi-computer systems
[Diagrams: UMA (Uniform Memory Access) and NUMA (Non-Uniform Memory Access) architectures]
Interconnection Networks
Classification based on Instruction and Data Parallelism
• The term 'stream' refers to a sequence or flow of either instructions or data operated on by the computer.
• In the complete cycle of instruction execution, a flow of instructions from main memory to the CPU is established. This flow of instructions is called the instruction stream.
• Similarly, operands flow bi-directionally between the processor and memory. This flow of operands is called the data stream.
Flynn’s Taxonomy
                         Instruction Streams
                         Single                         Multiple
Data Streams   Single    SISD                           MISD
                         (uniprocessors, pipelining)    (uncommon; fault tolerance)
               Multiple  SIMD                           MIMD
(A SISD vs SIMD code sketch follows.)
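As an illustration of the SISD and SIMD cells above, here is a minimal sketch; NumPy is an assumed dependency, not something the slides prescribe. The list comprehension processes one data item per instruction, while the NumPy expression applies one operation across the whole data stream.

```python
# A minimal sketch contrasting SISD-style and SIMD-style execution.
import numpy as np

a = list(range(1_000_000))

# SISD flavour: a single instruction stream touching one data item at a time.
out = [x * 2 for x in a]

# SIMD flavour: one operation applied across a whole data stream at once;
# NumPy dispatches to vectorized (often hardware-SIMD) kernels.
out_vec = np.asarray(a) * 2

assert out_vec[123] == out[123]
```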
Some basic concepts (esp. for programming in Big Data Systems)
» Coupling
» Tight - SIMD, MISD shared memory systems
» Loose - NOW, distributed systems, no shared memory
» Speedup
» how much faster can a program run when given N processors as opposed to 1 processor — T(1) / T(N)
» We will study Amdahl’s Law and Gustafson’s Law
» Parallelism of a program
» Compare time spent in computations to time spent in communication via shared memory or message passing
» Granularity
» Average number of compute instructions executed before communication is needed across processors
» Note:
» If granularity is coarse, use distributed systems; else use tightly coupled multi-processors/computers
» Potentially high parallelism does not lead to high speedup if granularity is too small, as overheads become high (see the sketch after this list)
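A minimal sketch of the speedup and granularity bookkeeping above, using assumed timings, and using a compute-to-communication time ratio as a stand-in for the instruction-count definition of granularity:

```python
# Illustrative numbers only - not measurements from any real system.
def speedup(t1: float, tn: float) -> float:
    """T(1) / T(N): how much faster with N processors than with 1."""
    return t1 / tn

# Granularity proxy: time computing vs. time communicating per phase.
compute_per_phase = 50e-3    # 50 ms of computation per phase (assumed)
comm_per_phase = 20e-3       # 20 ms of communication per phase (assumed)
granularity = compute_per_phase / comm_per_phase

print(speedup(100.0, 12.5))  # 8.0x, e.g. for an assumed run on N processors
print(granularity)           # 2.5 - fairly fine-grained: communication
                             # overhead will limit how far speedup can go
```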
Comparing Parallel and Distributed Systems
Distributed System                                        | Parallel System
Independent, autonomous systems connected in a            | Computer system with several processing units
network, accomplishing specific tasks                     | attached to it
Coordination is possible between connected                | A common shared memory can be directly accessed
computers, each with its own memory and CPU               | by every processing unit
Loose coupling of computers connected in a network,       | Tight coupling of processing resources, used for
providing access to data and remotely located resources   | solving a single, complex problem
Programs have coarse-grain parallelism                    | Programs may demand fine-grain parallelism
Motivation for parallel / distributed systems (1)
• Inherently distributed applications
• e.g. a financial transaction involving 2 or more parties
• Better scale by creating multiple smaller parallel tasks instead of one complex task
• e.g. evaluate an aggregate over 6 months of data
• Processors getting cheaper and networks faster
• e.g. processor speed 2x / 1.5 years, network traffic 2x / year, processors limited by energy consumption
• Better scale using replication or partitioning of storage
• e.g. replicated media servers for faster access, or replicated/partitioned shards in search engines
• Access to shared remote resources
• e.g. a remote central DB
• Increased performance/cost ratio compared to special parallel systems
• e.g. a search engine running on a Network-of-Workstations
Motivation for parallel / distributed systems (2)
• Better reliability, because there is less chance of multiple failures across cluster nodes
• Be careful about Integrity: consistent state of a resource across concurrent access
• Incremental scalability
• Add more nodes in a cluster to scale up (resize the cluster)
• e.g. clusters in Cloud services, autoscaling in AWS
• Offload computing closer to the user for scalability and better resource usage
• e.g. Edge computing servers
Example: Netflix
reference: https://medium.com/refraction-tech-everything/how-netflix-works-the-hugely-simplified-complex-stuff-that-happens-every-time-you-hit-play-3a40c9be254b
Distributed network of content caching servers
This would be a P2P network if you were using BitTorrent for free
Examples
Techniques for High Volume Data Processing
Method: Cluster computing
Description: A collection of computers, homogeneous or heterogeneous, using commodity components, running open source or proprietary software, communicating via message passing
Usage: Commonly used in Big Data Systems, such as Hadoop

Method: Massively Parallel Processing (MPP)
Description: Typically proprietary Distributed Shared Memory machines with integrated storage, e.g. EMC Greenplum (PostgreSQL on an MPP)
Usage: May be used in traditional Data Warehouses, Data processing appliances

Method: High-Performance Computing (HPC)
Description: Known to offer high performance and scalability by using in-memory computing
Usage: Used to develop specialty and custom scientific applications for research, where results are more valuable than cost
Topics for today
Limits of Parallelism
• A parallel program has some sequential / serial code and significant parallelized code
Amdahl’s Law
For a program with serial fraction f running on N processors, the speedup is S(N) = T(1) / T(N) = 1 / (f + (1 - f) / N); as N grows, S(N) approaches 1/f.
Amdahl’s Law - Example
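The original example's numbers are not preserved here, so the sketch below uses assumed values (f = 10% serial): the speedup creeps toward, but never exceeds, the 1/f cap.

```python
# Amdahl's Law with assumed example numbers (f = 10% serial fraction).
def amdahl_speedup(f: float, n: int) -> float:
    return 1.0 / (f + (1.0 - f) / n)    # S(N) = 1 / (f + (1-f)/N)

f = 0.10
for n in (1, 4, 16, 64, 256):
    print(n, round(amdahl_speedup(f, n), 2))
# 1 1.0, 4 3.08, 16 6.4, 64 8.77, 256 9.66 - creeping toward the 1/f = 10 cap
```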
Limitations in speedup
Super-linear speedup
Partitioning a data-parallel program to run across multiple processors may lead to a better cache hit rate.*
Possible in some workloads without much other overhead, esp. in embarrassingly parallel applications.
Why Amdahl’s Law is such bad news
Speedup plot
[Plot: Speedup for 1, 4, 16, 64, and 256 processors; T1 / TN = 1 / (f + (1-f)/N); x-axis: serial fraction f from 0.00% to 26.00%, y-axis: speedup from 0 to 256]
But wait - maybe we are missing something
Gustafson-Barsis Law
Let W be the execution workload of the program before adding resources, and let f be the sequential fraction of the workload.
So W = f * W + (1 - f) * W
Let W(N) be the larger execution workload after adding N processors: the parallelizable work can increase N times.
So W(N) = f * W + N * (1 - f) * W
The theoretical speedup in latency of the whole task, executed in the same fixed time T:
S(N) = (T * W(N)) / (T * W) = W(N) / W = (f * W + N * (1 - f) * W) / W
S(N) = f + (1 - f) * N
S(N) is not limited by f as N scales.
(Remember this when we discuss programming in Session 5. A comparison sketch follows.)
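A minimal sketch contrasting the two laws with the same assumed serial fraction f = 10%: under Gustafson-Barsis the problem size grows with N, so the speedup is not capped at 1/f.

```python
# Gustafson-Barsis with the same assumed f = 10%; workload grows with N.
def gustafson_speedup(f: float, n: int) -> float:
    return f + (1.0 - f) * n            # S(N) = f + (1-f) * N

f = 0.10
for n in (4, 16, 64, 256):
    print(n, round(gustafson_speedup(f, n), 2))
# 4 3.7, 16 14.5, 64 57.7, 256 230.5 - unlike Amdahl, not limited by 1/f
```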
Topics for today
Memory access models
» Shared memory
» Multiple tasks on different processors access a common address space, in UMA or NUMA architectures
» Conceptually easier for programmers
» Think of writing a voting algorithm - it is trivial because everyone is in the same room, i.e. writing the same variable (see the sketch below)
[Diagram: processors P, P, P attached to one shared memory]
» Distributed memory
» Multiple tasks – executing a single program – access data from separate (and isolated) address spaces (i.e. separate virtual memories)
» How will this remote access happen?
[Diagram: processor/memory (P/M) nodes connected over a network]
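A minimal sketch of the shared-memory model in the voting spirit above, using Python threads; the names (votes, cast_vote) and the lock discipline are illustrative, not from the slides.

```python
# Several threads "vote" by updating one variable in a shared address space.
import threading

votes = 0                      # shared variable, visible to every thread
lock = threading.Lock()        # guards concurrent access (integrity!)

def cast_vote(n_votes: int) -> None:
    global votes
    for _ in range(n_votes):
        with lock:             # without the lock, updates can be lost
            votes += 1

threads = [threading.Thread(target=cast_vote, args=(1000,)) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(votes)                   # 4000: every thread wrote the same variable
```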
Shared Memory Model: Implications for Architecture
Distributed memory and message passing
• In a Distributed Memory model, data has to be moved across Virtual Memories:
✓ i.e. a data item in VMem1 produced by task T1 has to be “communicated” to task T2 so that
✓ T2 can make a copy of the same in VMem2 and use it.
[Diagram: processors P1 and P2 with separate memories M1 and M2]
Computing model for message passing
Communication model for message passing
Process 1: send local variable X, tagged with a message id, as a message
Process 2: receive the message with that id and store it as local variable Y
(A runnable sketch follows.)
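A minimal sketch of this send/receive exchange using Python's multiprocessing.Queue as the message channel; the message id and variable names mirror the slide, but the API choice is an assumption.

```python
# Process 1 sends its local X as a tagged message; process 2 copies it into Y.
from multiprocessing import Process, Queue

def process_1(q: Queue) -> None:
    X = 42                         # local variable in process 1's address space
    q.put(("msg-1", X))            # send X as a message tagged with an id

def process_2(q: Queue) -> None:
    msg_id, Y = q.get()            # receive the message and copy into local Y
    print(msg_id, Y)               # Y is process 2's own copy of the data

if __name__ == "__main__":
    q = Queue()
    p1 = Process(target=process_1, args=(q,))
    p2 = Process(target=process_2, args=(q,))
    p1.start(); p2.start()
    p1.join(); p2.join()
```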
Distributed Memory Model: Implications for Architecture
* One can create a shared memory view using message passing and vice versa
Message Passing Model – Separate Address Spaces
• Use of separate address spaces complicates programming
• But this complication is usually restricted to one or two phases:
✓ Partitioning the input data
✓ Improves locality - computation closer to data
✓ Each process is enabled to access data from within its address space, which in turn is likely to be mapped to the memory hierarchy of the processor in which the process is running
✓ Merging / Collecting the output data
✓ This is required if each task is producing outputs that have to be combined
[Diagram: data items X, Y, Z partitioned across Processors A, B, C; outputs X’, Y’, Z’ collected on Processor A]
Remember granularity?
We will see the example of Hadoop map-reduce, where data is partitioned, outputs are communicated over messages, and merged to get the final answer.
Message Passing Primitives
» Important: A message can be received in the OS buffer but may not have been delivered to the application buffer. This is where a distributed message ordering logic can come in.

1. Send, Sync, Blocking - Returns only after data is sent from the kernel buffer. Easiest to program but longest wait.
2. Send, Async, Blocking - Returns after data is copied to the kernel buffer but not yet sent. A handle is returned to check send status.
3. Send, Sync, Non-blocking - Same as (2) but with no handle.
4. Send, Async, Non-blocking - Returns immediately with a handle. Complex to program but minimum wait.
5. Receive, Sync, Blocking - Returns after the application gets the data.
6. Receive, Sync, Non-blocking - Returns immediately with the data or a handle to check status. More efficient.

(A blocking vs non-blocking receive sketch follows.)
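A minimal sketch of the blocking vs non-blocking receive distinction (rows 5 and 6), using a local socket pair so it is self-contained; real distributed systems would use network sockets or an MPI-style library.

```python
# Blocking vs non-blocking receive on a local socket pair.
import socket

a, b = socket.socketpair()
a.sendall(b"hello")

# Blocking receive: returns only once data is available to the application.
data = b.recv(16)
print(data)

# Non-blocking receive: returns immediately; if nothing has been delivered
# yet, the call raises BlockingIOError instead of waiting.
b.setblocking(False)
try:
    more = b.recv(16)
except BlockingIOError:
    more = None                 # nothing delivered yet - check again later
print(more)
```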
Topics for today
Data Access Strategies: Partition
• Strategy:
✓ Partition data – typically, equally – across the nodes of the (distributed) system (a hash-partition sketch follows)
• Cost:
✓ Network access and merge cost when a query needs to go across partitions
• Advantage(s):
✓ Works well if the task/algorithm is (mostly) data parallel
✓ Works well when there is Locality of Reference within a partition
• Concerns
✓ Merging data fetched from multiple partitions
✓ Partition balancing
✓ Row vs Columnar layouts - which improves locality of reference?
✓ Will study shards and partitions in Hadoop and MongoDB
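A minimal sketch of the partition strategy, hash-partitioning assumed records across simulated nodes; N_NODES, partition(), and the sample data are illustrative only, not any particular system's API.

```python
import zlib

N_NODES = 4
nodes = [dict() for _ in range(N_NODES)]      # one store per simulated node

def partition(key: str) -> int:
    """Stable hash of the key -> partition (node) index."""
    return zlib.crc32(key.encode()) % N_NODES

for key, value in [("alice", 1), ("bob", 2), ("carol", 3)]:
    nodes[partition(key)][key] = value        # each record lives on one node

# A single-key query touches one partition; an aggregate must visit all
# partitions and merge the results - the cross-partition cost noted above.
print(nodes[partition("bob")]["bob"])
print(sum(sum(n.values()) for n in nodes))
```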
Data Access Strategies: Replication
• Strategy:
✓ Replicate all data across nodes of the (distributed) system
• Cost:
✓ Higher storage cost
• Advantage(s):
✓ All data accessed from local disk: no (runtime) communication on the network
✓ High performance with parallel access
✓ Fail over across replicas
• Concerns
✓ Keep replicas in sync — various consistency models between readers and writers
✓ Will study in depth for MongoDB (a replica failover-read sketch follows)
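A minimal sketch of failover reads against replicas; the replicas list and read() are illustrative stand-ins, not a real client API.

```python
# Every replica holds the same data; reads fail over past dead replicas.
replicas = [
    {"ok": False, "data": {"x": 1}},   # assume this replica is down
    {"ok": True,  "data": {"x": 1}},   # the same data is on every replica
]

def read(key: str):
    for r in replicas:                  # try replicas in order
        if r["ok"]:                     # fail over past unreachable ones
            return r["data"][key]       # any live replica can serve the read
    raise RuntimeError("no replica available")

print(read("x"))
# Writes are the hard part: every replica must be kept in sync, which is
# where the consistency models mentioned above come in.
```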
Data Access Strategies: (Dynamic) Communication
• Strategy:
✓ Communicate (at runtime) only the data that is required (a minimal sketch follows)
• Cost:
✓ High network cost when the system is loosely coupled and the data set to be exchanged is large
• Advantage(s):
✓ Minimal communication cost when only a small portion of the data is actually required by each node
• Concerns
✓ Requires a highly available and performant network
✓ Works best with fairly independent parallel data processing
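A minimal sketch of fetching only the required items instead of moving the whole dataset; remote_store and fetch() stand in for a network call.

```python
# Instead of copying a remote dataset, a node asks only for the keys it needs.
remote_store = {f"key{i}": i for i in range(1_000_000)}   # lives on another node

def fetch(key: str) -> int:
    """Stand-in for a network request returning one item, not the dataset."""
    return remote_store[key]

# Only two items cross the (simulated) network, not a million.
needed = ["key7", "key42"]
print(sum(fetch(k) for k in needed))
```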
Data Access Strategies – Networked Storage
Topics for today
Computer Cluster - Definition
Cluster - Objectives
• A computer cluster is typically built for one of the following two reasons:
✓ High Performance - referred to as compute clusters
✓ High Availability - achieved via redundancy
An off-the-shelf or custom load balancer or reverse proxy can be configured to serve the use case.
Hadoop nodes form a cluster for performance (independent Map/Reduce jobs are started on multiple nodes) and availability (data is replicated on multiple nodes for fault tolerance).
Most Big Data systems run in a cluster configuration for performance and availability.
Clusters – Peer to Peer computation
Client-Server vs. Peer-to-Peer
• Client-Server Computation
✓ A server node performs the core computation – business logic in the case of applications
✓ Client nodes request such computation
✓ At the programming level this is referred to as the request-response model
✓ Email, network file servers, …
• Peer-to-Peer Computation:
✓ All nodes are peers, i.e. they perform core computations and may act as client or server for each other.
✓ BitTorrent, some multi-player games, clusters
Cloud and Clusters
Motivation for using Clusters (1)
Motivation for using Clusters (2)
• Scale-out clusters with commodity workstations as nodes are suitable for software environments that are resilient:
✓ i.e. individual nodes may fail, but
✓ middleware and software will enable computations to keep running (and keep services available) for end users
✓ for instance, the back-ends of Google and Facebook use this model.
• On the other hand, (public) cloud infrastructure is typically built as clusters of servers
✓ due to the higher reliability of individual servers – used as nodes – compared to that of workstations as nodes.
Typical cluster components
[Diagram: typical cluster components - parallel applications running across cluster nodes; reference: http://www.cloudbus.org/papers/SSI-CCWhitePaper.pdf]
Example cluster: Hadoop
• A job is divided into tasks
• Considers every task as either a Map or a Reduce
• Tasks are assigned to a set of nodes (the cluster)
• Special control nodes manage the nodes for resource management, setup, monitoring, data transfer, failover etc.
• Hadoop clients work with these control nodes to get the job done (a word-count sketch of the Map/Reduce task types follows)
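A minimal word-count sketch of the Map and Reduce task types that Hadoop schedules across a cluster; here both phases and the shuffle run locally, purely for illustration.

```python
# Word count in the map-reduce style: map emits (word, 1) pairs, the shuffle
# groups them by key, and reduce sums each group.
from collections import defaultdict
from typing import Iterable, Tuple

def map_task(line: str) -> Iterable[Tuple[str, int]]:
    for word in line.split():
        yield (word, 1)                    # emit intermediate key-value pairs

def reduce_task(word: str, counts: Iterable[int]) -> Tuple[str, int]:
    return (word, sum(counts))             # combine all values for one key

lines = ["big data systems", "big systems"]
groups = defaultdict(list)
for line in lines:                         # map phase (parallel across nodes)
    for word, n in map_task(line):
        groups[word].append(n)             # shuffle: group by key
print([reduce_task(w, c) for w, c in groups.items()])
```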
Summary
Next Session:
Big Data Analytics and Systems