Principles of Scalable Performance

The document discusses principles of scalable performance including performance metrics, speedup laws, scaling principles, parallelism profiles, degree of parallelism, average parallelism, available parallelism, asymptotic speedup, performance measures, redundancy, utilization, quality of parallelism, and standard performance benchmarks.

Uploaded by Senthil Ganesh R

Principles of Scalable Performance

• Performance measures
• Speedup laws
• Scalability principles
• Scaling up vs. scaling down

Performance metrics and measures

• Parallelism profiles

• Asymptotic speedup factor

• System efficiency, utilization and quality

• Standard performance measures

Parallelism Profiles in Programs

• The degree of parallelism reflects the extent to which software parallelism matches hardware parallelism
Degree of parallelism

• Execution of a program on a parallel computer may use different numbers of processors at different time periods during the execution cycle
• The number of processors used to execute the program during each period is the degree of parallelism (DOP)
• The DOP is a discrete time function, taking only non-negative integer values
Degree of parallelism
• The parallelism profile is a plot of the DOP as a function of time

• It ideally assumes unlimited resources

• Software tools are available to trace the parallelism profile
Factors affecting parallelism profiles

• Algorithm structure

• Program optimization

• Resource utilization

• Run-time conditions

Degree of parallelism

• The DOP assumes an unbounded number of available processors and other necessary resources

• The maximum DOP may not be achievable on a real computer with limited resources

• When the DOP exceeds the maximum number of available processors, the parallel branches are executed in chunks sequentially
Degree of parallelism
• Parallelism still exists within each chunk, limited by the machine size

• It is also limited by memory and other non-processor resources
Average parallelism: variables

• n – number of homogeneous processors

• m – maximum parallelism in a profile

• Δ – computing capacity of a single processor (execution rate only, no overhead)

• DOP = i – i processors are busy during an observation period
Average parallelism

• The total amount of work performed is proportional to the area under the profile curve:

W = \Delta \int_{t_1}^{t_2} \mathrm{DOP}(t)\, dt

W = \Delta \sum_{i=1}^{m} i \cdot t_i

• t_i – total amount of time that DOP = i

• t_2 - t_1 – total elapsed time
Average parallelism

A = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} \mathrm{DOP}(t)\, dt

A = \left( \sum_{i=1}^{m} i \cdot t_i \right) \Big/ \left( \sum_{i=1}^{m} t_i \right)
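The work and average-parallelism formulas above can be sketched numerically; the profile values below are hypothetical, not from the slides:

```python
# Sketch: total work W and average parallelism A computed from a
# discrete parallelism profile. Profile values are hypothetical.

# t_i = total time during which DOP = i (arbitrary time units)
profile = {1: 2.0, 2: 3.0, 4: 1.0, 8: 0.5}

delta = 1.0  # computing capacity of a single processor (work per time unit)

# W = delta * sum_i (i * t_i) -- area under the profile curve
W = delta * sum(i * t for i, t in profile.items())

# A = sum_i (i * t_i) / sum_i (t_i)
A = sum(i * t for i, t in profile.items()) / sum(profile.values())
```

Because the profile is a step function, the integral collapses to the weighted sum over the discrete DOP values.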
Example: parallelism profile and average parallelism
(figure)
Available Parallelism

• Potential parallelism in application programs

• Engineering and scientific codes exhibit a high DOP due to data parallelism

• Ordinary computations exhibit little parallelism when analysis does not cross basic block boundaries

• Basic block – a block of instructions with single entry and single exit points

• Compiler optimization and algorithm redesign can increase the available parallelism
Asymptotic speedup

T(1) = \sum_{i=1}^{m} t_i(1) = \sum_{i=1}^{m} W_i / \Delta

T(\infty) = \sum_{i=1}^{m} t_i(\infty) = \sum_{i=1}^{m} W_i / (i \Delta)

S_\infty = T(1)/T(\infty) = \left( \sum_{i=1}^{m} W_i \right) \Big/ \left( \sum_{i=1}^{m} W_i / i \right) = A in the ideal case

(T is the response time)
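A minimal sketch of the ratio above, assuming Δ = 1 and hypothetical per-DOP work amounts W_i:

```python
# Sketch: asymptotic speedup with unlimited processors, delta = 1.
# W_i values (work done while DOP = i) are hypothetical.
W = {1: 10.0, 2: 20.0, 4: 40.0}

T1 = sum(W.values())                     # T(1): one processor does all the work
Tinf = sum(w / i for i, w in W.items())  # T(inf): DOP-i work runs i-wide
S_inf = T1 / Tinf                        # equals the average parallelism A ideally
```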
Performance measures

• Consider n processors executing m programs in various modes with different performance levels
• We want to define the mean performance of these multimode computers:
• Arithmetic mean performance
• Geometric mean performance
• Harmonic mean performance
Arithmetic mean performance

R_a = \sum_{i=1}^{m} R_i / m    (arithmetic mean execution rate; assumes equal weighting)

R_a^* = \sum_{i=1}^{m} f_i R_i    (weighted arithmetic mean execution rate)

– proportional to the sum of the inverses of execution times
Geometric mean performance

R_g = \prod_{i=1}^{m} R_i^{1/m}    (geometric mean execution rate)

R_g^* = \prod_{i=1}^{m} R_i^{f_i}    (weighted geometric mean execution rate)

– does not summarize the real performance since it does not have the inverse relation with the total time
Harmonic mean performance

Mean execution time per instruction for program i:

T_i = 1 / R_i

T_a = \frac{1}{m} \sum_{i=1}^{m} T_i = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{R_i}    (arithmetic mean execution time per instruction)
Harmonic mean performance

R_h = 1 / T_a = m \Big/ \sum_{i=1}^{m} (1 / R_i)    (harmonic mean execution rate)

R_h^* = 1 \Big/ \sum_{i=1}^{m} (f_i / R_i)    (weighted harmonic mean execution rate)

– corresponds to the total number of operations divided by the total time (closest to the real performance)
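A minimal numeric comparison of the three means (hypothetical rates, equal weights f_i = 1/m); the harmonic mean is the one that equals total work divided by total time:

```python
from math import prod

# Hypothetical execution rates R_i (e.g. in Mflops) for m programs
R = [4.0, 8.0, 16.0]
m = len(R)

Ra = sum(R) / m                    # arithmetic mean rate
Rg = prod(R) ** (1.0 / m)          # geometric mean rate
Rh = m / sum(1.0 / r for r in R)   # harmonic mean rate

# With one unit of work per program, total time is sum(1/R_i), so the
# harmonic mean equals total work / total time (the observed rate).
total_work = m
total_time = sum(1.0 / r for r in R)
observed_rate = total_work / total_time
```

For distinct rates, Rh < Rg < Ra always holds; the arithmetic mean overstates the real performance.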
Harmonic Mean Speedup

• Ties the various modes of a program to the number of processors used

• The program is in execution mode i if i processors are used

S = T_1 / T^* = 1 \Big/ \sum_{i=1}^{n} (f_i / R_i)

• Sequential execution time T_1 = 1/R_1 = 1
Harmonic Mean Speedup Performance
(figure)
Amdahl’s Law

• Assume R_i = i, w = (\alpha, 0, 0, \ldots, 1 - \alpha)
• The system is either sequential, with probability \alpha, or fully parallel, with probability 1 - \alpha

S_n = \frac{n}{1 + (n - 1)\alpha}

• Implies S_n \to 1/\alpha as n \to \infty
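The law is easy to evaluate directly; a small sketch with an illustrative sequential fraction α = 0.1:

```python
# Amdahl's law: S_n = n / (1 + (n - 1) * alpha); alpha here is illustrative.
def amdahl_speedup(n, alpha):
    return n / (1 + (n - 1) * alpha)

alpha = 0.1
speedups = {n: amdahl_speedup(n, alpha) for n in (4, 16, 1024)}
# As n grows without bound, the speedup approaches the ceiling 1/alpha = 10,
# no matter how many processors are added.
```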
Speedup Performance
(figure)
System Efficiency

• O(n) is the total number of unit operations performed on an n-processor system

• T(n) is the execution time in unit time steps

• In general, T(n) < O(n) and T(1) = O(1)

S(n) = T(1) / T(n)

E(n) = S(n) / n = T(1) / (n T(n))
Redundancy and Utilization

• Redundancy signifies the extent of matching between software and hardware parallelism

R(n) = O(n) / O(1)

• Utilization indicates the percentage of resources kept busy during execution

U(n) = R(n) E(n) = O(n) / (n T(n))
Quality of Parallelism

• Directly proportional to the speedup and efficiency, and inversely related to the redundancy

• Upper-bounded by the speedup S(n)

Q(n) = \frac{S(n) E(n)}{R(n)} = \frac{T^3(1)}{n T^2(n) O(n)}
Example of Performance

• Given O(1) = T(1) = n^3, O(n) = n^3 + n^2 \log n, and T(n) = 4n^3/(n+3):

S(n) = (n+3)/4
E(n) = (n+3)/(4n)
R(n) = (n + \log n)/n
U(n) = (n+3)(n + \log n)/(4n^2)
Q(n) = (n+3)^2 / (16(n + \log n))
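These closed forms can be checked numerically at a single machine size; the sketch below assumes log means log base 2 (the slides do not specify a base, and the simplified forms hold for any consistent choice):

```python
from math import log

# Check the example's closed forms at one (arbitrary) machine size.
n = 8.0
O1 = T1 = n ** 3                   # O(1) = T(1) = n^3
On = n ** 3 + n ** 2 * log(n, 2)   # O(n); base-2 log assumed
Tn = 4 * n ** 3 / (n + 3)          # T(n)

S = T1 / Tn     # speedup
E = S / n       # efficiency
Rd = On / O1    # redundancy
U = Rd * E      # utilization
Q = S * E / Rd  # quality
```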
Standard Performance Measures

• MIPS and Mflops describe the instruction execution rate and the floating-point capability of a parallel computer

• MIPS = f \times I_c / (C \times 10^6), where f is the clock rate, I_c the instruction count, and C the total cycle count

• The MIPS rating depends on the instruction set and varies between programs

• Mflops depends on the machine hardware design and on program behavior
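A quick sanity check of the MIPS formula with illustrative numbers (these are not from the slides):

```python
# MIPS = f * Ic / (C * 10^6): f = clock rate (Hz), Ic = instruction
# count, C = total clock cycles. All values below are illustrative.
f = 100e6         # 100 MHz clock
Ic = 50_000_000   # instructions executed
C = 200_000_000   # cycles consumed, i.e. CPI = C / Ic = 4

mips = f * Ic / (C * 1e6)   # equivalently f / (CPI * 10^6) = 25 MIPS
```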
Standard Performance Measures

• Dhrystone results
• A CPU-intensive benchmark
• Consists of 100 high-level language statements and diverse data types
• Balanced with respect to statement type, data type, and locality of reference, with no operating system calls and no use of library functions or subroutines
• A measure of the integer performance of modern processors
Standard Performance Measures

• Whetstone results
• A Fortran-based synthetic benchmark
• A measure of floating-point performance
• Includes both integer and floating-point operations involving array indexing, subroutine calls, parameter passing, and conditional branching
Standard Performance Measures
• Performance depends on the compilers used

• Dhrystone is intended to test the CPU

• Compiler techniques such as procedure in-lining affect Dhrystone performance

• This sensitivity to compilers is a drawback of these benchmarks
Standard Performance Measures
• TPS and KLIPS ratings
• On-line transaction processing applications demand rapid, interactive processing of a large number of relatively simple transactions
• They are supported by very large databases
• Automated teller machines and airline reservation systems are examples
• Measured as transaction performance
Standard Performance Measures

• The throughput of computers for on-line transaction processing is measured in transactions per second (TPS)

• A transaction may involve database search, query answering, and database update operations

• In AI applications, the measure is KLIPS (kilo logic inferences per second)

• It indicates the reasoning power of an AI machine
Standard Performance Measures

• Japan’s fifth-generation computer system targeted a performance of 400 KLIPS

• 400 KLIPS ≈ 40 MIPS, since each logic inference typically requires on the order of 100 instructions

• Logic inference demands symbolic manipulation