
CS 134: Operating Systems
Multiprocessing

2013-05-17
1 / 13
Overview

Multiprocessing Designs
OS Implications
Programming Models
Other Issues
2 / 13
Multiprocessing Designs
SIMD and MIMD

Multiple CPUs come in several flavors:

SIMD: Single Instruction, Multiple Data
- Also called a vector processor
- Sample instruction: a[i] = b[i] + c[i] for i in a small range (e.g., 0-3)
- Canonical example: GPUs

MIMD: Multiple Instruction, Multiple Data
- I.e., 2 or more (semi-)independent CPUs

We won't talk further about SIMD; from an OS point of view it's just another CPU.
3 / 13
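As a concrete illustration of the sample instruction above, here is a minimal sketch of the same a[i] = b[i] + c[i] computation written with x86 SSE intrinsics; the function name and the four-element size are just for illustration, and a compiler with SSE support is assumed.

    #include <xmmintrin.h>   /* SSE intrinsics */

    /* a[i] = b[i] + c[i] for i in 0-3: one SIMD add does four scalar adds. */
    void add4(float a[4], const float b[4], const float c[4])
    {
        __m128 vb = _mm_loadu_ps(b);     /* load b[0..3] into one 128-bit register */
        __m128 vc = _mm_loadu_ps(c);     /* load c[0..3] */
        __m128 va = _mm_add_ps(vb, vc);  /* a single ADDPS instruction */
        _mm_storeu_ps(a, va);            /* store the four sums */
    }

From the OS's perspective nothing special happens here: the vector registers are just more per-CPU state to save and restore on a context switch.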
Multiprocessing Designs
MIMD Approaches

MIMD can be:
- Several chips or cores, (semi-)private memories, able to access each other's memory (NUMA: Non-Uniform Memory Access)
- Several chips or cores, one memory (SMP: Symmetric Multiprocessing)
- Several boxes (possibly each SMP or NUMA) connected by a network (distributed system)
4 / 13
OS Implications
NUMA Issues

NUMA means processes access local memory faster
⇒ Allocate process memory on the local CPU
⇒ Processes should have "CPU affinity"
5 / 13
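On Linux, one way an application can act on the first arrow is libnuma; the sketch below is an assumption about typical use (a 1 MiB buffer, linking with -lnuma), not something from the lecture.

    #include <numa.h>    /* libnuma: link with -lnuma */
    #include <stdio.h>

    int main(void)
    {
        if (numa_available() < 0) {
            fprintf(stderr, "NUMA not supported on this system\n");
            return 1;
        }

        /* Allocate on the node the calling thread is currently running on,
         * so later accesses are local (fast) rather than remote (slow). */
        size_t len = 1 << 20;
        void *buf = numa_alloc_local(len);
        if (buf == NULL)
            return 1;

        /* ... use buf ... */

        numa_free(buf, len);
        return 0;
    }

The second arrow, CPU affinity, is usually handled with sched_setaffinity or pthread_setaffinity_np; see the sketch under SMP Scheduling below.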
OS Implications
SMP Issues

SMPs still have caches

This introduces cache coherency problems:
- Processor 0 uses compare-and-swap to set a lock nonzero
- The write goes into its local cache for speed
- Processor 1 reads the lock from its own cache and sees it's still zero...

Cure: hardware coherency guarantees
...but spinlocks now have super-high costs
- May be better to do a thread switch

A thread switch is high cost, but may be cheaper than a spinlock.
6 / 13
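To make the coherency cost concrete, here is a minimal test-and-test-and-set spinlock sketch in C11 atomics; the type and function names are invented for illustration. Every failed compare-and-swap bounces the lock's cache line between processors, so the inner loop waits with plain reads that can be satisfied from the local cache.

    #include <stdatomic.h>

    typedef struct { atomic_int locked; } spinlock_t;   /* 0 = free, 1 = held */

    static void spin_lock(spinlock_t *l)
    {
        for (;;) {
            int expected = 0;
            /* Compare-and-swap: set the lock to 1 only if it is currently 0. */
            if (atomic_compare_exchange_weak(&l->locked, &expected, 1))
                return;                       /* got it */

            /* Lock is held: wait with ordinary loads so we mostly hit our own
             * cached copy instead of generating more coherency traffic.
             * A kernel might decide here that a thread switch is cheaper. */
            while (atomic_load(&l->locked) != 0)
                ;
        }
    }

    static void spin_unlock(spinlock_t *l)
    {
        atomic_store(&l->locked, 0);   /* the write invalidates other caches */
    }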
OS Implications
SMP Scheduling

Threads are often related
- Schedule independently or together?
- Completely independent: job completion time is set by the slowest thread
- Together: some CPUs may be wasted waiting for events
- Always good to keep thread x on the same CPU (because its cache is already filled)
7 / 13
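Keeping "thread x on the same CPU" is exactly what a per-thread affinity mask expresses. A minimal sketch using the Linux-specific pthread_setaffinity_np (the helper name and the choice of pinning the calling thread are illustrative):

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>

    /* Pin the calling thread to one CPU so the scheduler keeps it where its
     * cache is already filled.  Returns 0 on success. */
    int pin_self_to_cpu(int cpu)
    {
        cpu_set_t mask;
        CPU_ZERO(&mask);
        CPU_SET(cpu, &mask);
        return pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);
    }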
OS Implications
Distributed Systems

Many ways to communicate

Most important modern approach is... the Internet!

Communicating with skinny wires introduces new problems:
- Can't move a process to another machine (or must work hard)
- Locking becomes really hard
- Programming multiprocessor systems is much harder
- ...and what if the network connection goes down?
8 / 13
Programming Models
RPC

Programming is hard, so we need abstractions that simplify things

Remote Procedure Call (RPC) makes a distant system look like a normal function:
1. Marshal arguments (i.e., pack up and serialize)
2. Send procedure ID and arguments to the remote system
3. Wait for the response
4. Deserialize the return value

Class Exercise
What are the advantages and disadvantages?
9 / 13
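A toy client-side stub makes the four steps concrete. This is only a sketch under assumptions: the procedure ID, the wire format (three 32-bit words), and the helpers send_all/recv_all are invented, and error handling is omitted.

    #include <stdint.h>
    #include <stddef.h>
    #include <arpa/inet.h>   /* htonl / ntohl */

    /* Assumed helpers (not shown): send or receive exactly len bytes. */
    int send_all(int sock, const void *buf, size_t len);
    int recv_all(int sock, void *buf, size_t len);

    #define PROC_ADD 1       /* made-up procedure ID */

    /* To the caller this looks like a normal function; the work happens remotely. */
    int32_t remote_add(int sock, int32_t a, int32_t b)
    {
        /* 1. Marshal: pack procedure ID and arguments in network byte order. */
        uint32_t msg[3] = { htonl(PROC_ADD), htonl((uint32_t)a), htonl((uint32_t)b) };

        /* 2. Send procedure ID and arguments to the remote system. */
        send_all(sock, msg, sizeof msg);

        /* 3. Wait for the response. */
        uint32_t reply;
        recv_all(sock, &reply, sizeof reply);

        /* 4. Deserialize the return value. */
        return (int32_t)ntohl(reply);
    }

The server side is the mirror image: read the request, dispatch on the procedure ID, compute, and marshal the result back.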
Programming Models
DSM

RPC is nice, but limits parallelism
SMPs can do cool things because memory is shared

So why not simulate shared memory across the network?

Teeny problem: hard to make it work fast*

* "Hard" is a gross understatement.
10 / 13
Other Issues
Load Balancing

Suppose you have servers A, B, C, and D
A and B are currently overloaded, C and D underloaded
A notices the situation and sends excess work to C and D
Simultaneously, B does the same! Now C and D are overloaded

Result can be thrashing

Common solution: have one front-end machine whose sole job is allocating load to others

Random assignment works surprisingly well.
11 / 13
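A minimal sketch of the front-end idea, using the "random assignment" policy mentioned in the notes; the backend names and the pick_backend helper are invented for illustration.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    static const char *backends[] = { "A", "B", "C", "D" };
    enum { NBACKENDS = 4 };

    /* Front-end policy: pick a backend uniformly at random.  Because no server
     * reacts to load on its own, the thrashing scenario above (A and B both
     * dumping work on C and D at once) cannot arise. */
    static const char *pick_backend(void)
    {
        return backends[rand() % NBACKENDS];
    }

    int main(void)
    {
        srand((unsigned)time(NULL));
        for (int request = 0; request < 8; request++)
            printf("request %d -> server %s\n", request, pick_backend());
        return 0;
    }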
Other Issues
How Does Google Work?

Well, it's a secret...

But basically they use the front-end approach

Obvious problem: one front end can't handle millions of requests per second even if it does almost nothing

Solution: DNS Round Robin tricks you into picking one of many dozens of front ends (roughly at random) to talk to
12 / 13
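From a client's point of view, DNS round robin simply means one name resolves to several addresses, and different lookups see them in different orders. A minimal sketch of such a lookup on a POSIX system (IPv4 only, minimal error handling):

    #include <stdio.h>
    #include <string.h>
    #include <netdb.h>
    #include <arpa/inet.h>

    int main(void)
    {
        struct addrinfo hints, *res, *p;
        memset(&hints, 0, sizeof hints);
        hints.ai_family   = AF_INET;       /* A records only */
        hints.ai_socktype = SOCK_STREAM;

        if (getaddrinfo("www.google.com", "80", &hints, &res) != 0)
            return 1;

        /* Walk the list of front-end addresses the resolver handed back;
         * a client normally just connects to the first one. */
        for (p = res; p != NULL; p = p->ai_next) {
            char buf[INET_ADDRSTRLEN];
            struct sockaddr_in *sa = (struct sockaddr_in *)p->ai_addr;
            inet_ntop(AF_INET, &sa->sin_addr, buf, sizeof buf);
            printf("www.google.com has address %s\n", buf);
        }
        freeaddrinfo(res);
        return 0;
    }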
Other Issues
Example of Google's DNS tricks

These commands were run within 15 seconds of each other:

bow:2:877> host www.google.com
www.google.com has address 74.125.224.241
www.google.com has address 74.125.224.242
www.google.com has address 74.125.224.243
www.google.com has address 74.125.224.244
www.google.com has address 74.125.224.240

bow:2:878> ssh lever.cs.ucla.edu host www.google.com
www.google.com has address 74.125.239.19
www.google.com has address 74.125.239.20
www.google.com has address 74.125.239.17
www.google.com has address 74.125.239.18
www.google.com has address 74.125.239.16
13 / 13
