Tuning Python Applications Can
Dramatically Increase Performance
Vasilij Litvinov
Software Engineer, Intel
Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice
Legal Disclaimer & Optimization Notice
INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL
OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL
ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY,
RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A
PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER
INTELLECTUAL PROPERTY RIGHT.
Software and workloads used in performance tests may have been optimized for performance only on Intel
microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer
systems, components, software, operations and functions. Any change to any of those factors may cause the results to
vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products.
Copyright © 2016, Intel Corporation. All rights reserved. Intel, Pentium, Xeon, Xeon Phi, Core, VTune, Cilk, and the Intel
logo are trademarks of Intel Corporation in the U.S. and other countries.
Optimization Notice
Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are
not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other
optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on
microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for
use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel
microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the
specific instruction sets covered by this notice.
Notice revision #20110804
Agenda
• Why do we need Python optimization?
• How does one find the code to tune?
• Overview of existing tools
• An example
• Intel® VTune™ Amplifier capabilities and comparison
• Q & A
Why do we need Python optimization?
• Python is used to power a wide range of software, including applications where performance matters
• Some Python code may not scale well, but you won’t know it unless you give it enough workload to chew on
• Sometimes you are just not happy with the speed of your code
All in all, there are times when you want to make your code run faster or be more responsive (insert your favorite buzzword here). So you need to optimize (or tune) your code.
How does one find the code to tune: measuring vs. guessing
• Hard stare = Often wrong
• Logging = Analysis is tedious
• Profile = Accurate, Easy
Not All Profilers Are Equal
There are different profiling techniques, for example:
• Event-based
  • Example: built-in Python cProfile profiler
• Instrumentation-based
  • Usually requires modifying the target application (source code, compiler support, etc.)
  • Example: line_profiler
• Statistical
  • Accurate enough & less intrusive
  • Example: vmstat, statprof
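To make the event-based category more concrete, here is a minimal sketch that counts Python function calls through the standard `sys.setprofile` hook. It assumes Python 3; the helper `count_calls` and the functions `helper` and `work` are made up for illustration. cProfile is implemented in C and records much more (timings, callers), so this only shows the general idea of reacting to interpreter events.

```python
import sys
from collections import Counter


def count_calls(func, *args, **kwargs):
    """Run func and count 'call' events for Python-level functions."""
    calls = Counter()

    def tracer(frame, event, arg):
        # The interpreter invokes this hook on call/return events.
        if event == "call":
            calls[frame.f_code.co_name] += 1

    sys.setprofile(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.setprofile(None)  # always uninstall the hook
    return result, calls


def helper(x):
    return x + 1


def work(n):
    total = 0
    for i in range(n):
        total += helper(i)
    return total


result, calls = count_calls(work, 1000)
print(result, calls["work"], calls["helper"])
```

Because the hook runs on every call, its cost grows with the number of calls rather than with elapsed time, which is one reason event-based tools tend to slow down call-heavy code the most.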
Most Profilers – High Overhead, No Line Info

| Tool | Description | Platforms | Profile level | Avg. overhead |
|---|---|---|---|---|
| cProfile (built-in) | Text interactive mode: “pstats” (built-in); GUI viewer: RunSnakeRun; Open Source | Any | Function | 1.3x-5x |
| Python Tools | Visual Studio (2010+); Open Source | Windows | Function | ~2x |
| PyCharm | Not free; cProfile/yappi based | Any | Function | 1.3x-5x (same as cProfile) |
| line_profiler | Pure Python; Open Source; Text-only viewer | Any | Line | Up to 10x or more |
Example performance hogs

Task: Concatenate a list of strings
Slow way:
    s = ''
    for ch in some_lst:
        s += ch
Faster way:
    s = ''.join(some_lst)
Reason: strings are immutable, so repeated concatenation requires re-allocating and copying memory

Task: Remove some value from a list
Slow way:
    while some_value in lst:
        lst.remove(some_value)
Faster way:
    while True:
        try:
            lst.remove(some_value)
        except ValueError:
            break
Reason: both in and .remove() have complexity O(n), and the slower version searches the list twice for each removal, so it’s about twice as slow
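The two performance hogs above can be checked with a small, hedged micro-benchmark. This is a sketch assuming Python 3 and the standard `timeit` module; the function names and input sizes are invented for illustration. The measured ratios depend on the interpreter and machine (CPython can sometimes extend a string in place on `+=`), so treat the numbers as indicative only.

```python
import timeit


def concat_slow(items):
    s = ''
    for ch in items:
        s += ch              # may re-allocate and copy on each step
    return s


def concat_fast(items):
    return ''.join(items)    # one pass, one allocation for the result


def remove_slow(lst, value):
    while value in lst:      # first O(n) scan
        lst.remove(value)    # second O(n) scan
    return lst


def remove_fast(lst, value):
    while True:
        try:
            lst.remove(value)  # single O(n) scan per removal
        except ValueError:
            break
    return lst


chars = ['a', 'b', 'c'] * 1000
data = [0, 1] * 200

# Both variants must agree before timing means anything.
assert concat_slow(chars) == concat_fast(chars)
assert remove_slow(list(data), 0) == remove_fast(list(data), 0)

for name, stmt in [
    ('concat slow', lambda: concat_slow(chars)),
    ('concat fast', lambda: concat_fast(chars)),
    ('remove slow', lambda: remove_slow(list(data), 0)),
    ('remove fast', lambda: remove_fast(list(data), 0)),
]:
    print('%-12s %.4f sec' % (name, timeit.timeit(stmt, number=20)))
```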
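Python example to profile: demo.py

    class Encoder:
        # Maps a character to its replacement; other characters pass through
        CHAR_MAP = {'a': 'b', 'b': 'c'}

        def __init__(self, input):
            self.input = input

        def process_slow(self):
            # Builds the result by repeated string concatenation
            result = ''
            for ch in self.input:
                result += self.CHAR_MAP.get(ch, ch)
            return result

        def process_fast(self):
            # Collects the pieces in a list and joins them once at the end
            result = []
            for ch in self.input:
                result.append(self.CHAR_MAP.get(ch, ch))
            return ''.join(result)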
Python sample to profile: run.py

    import demo
    import time

    def slow_encode(input):
        return demo.Encoder(input).process_slow()

    def fast_encode(input):
        return demo.Encoder(input).process_fast()

    if __name__ == '__main__':
        input = 'a' * 10000000  # 10 million 'a' characters
        start = time.time()
        s1 = slow_encode(input)
        slow_stop = time.time()
        # A parenthesized single-argument print runs on both Python 2 and 3
        print('slow: %.2f sec' % (slow_stop - start))
        s2 = fast_encode(input)
        print('fast: %.2f sec' % (time.time() - slow_stop))

No profiling overhead: a baseline for comparing the tools’ overhead
slow: 9.15 sec = 1.00x
fast: 3.16 sec = 1.00x
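The "x" factors quoted throughout the deck are the profiled wall time divided by this unprofiled baseline. A small sketch of that arithmetic follows, using the baseline above and the cProfile timings reported in the deck; the function name `overhead` is an illustrative assumption.

```python
BASELINE = {'slow': 9.15, 'fast': 3.16}   # seconds, unprofiled run.py


def overhead(profiled_seconds, baseline_seconds):
    """Slowdown factor of a profiled run relative to the baseline."""
    return profiled_seconds / baseline_seconds


# Timings reported for the cProfile run in this deck.
cprofile_run = {'slow': 10.27, 'fast': 5.34}

for name in ('slow', 'fast'):
    factor = overhead(cprofile_run[name], BASELINE[name])
    print('%s: %.2f sec = %.2fx' % (name, cprofile_run[name], factor))
```

Note that the same tool can show very different factors for the two functions, because the fast version makes many more calls per second of real work.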
cProfile + pstats UI example

> python -m cProfile -o run.prof run.py
> python -m pstats run.prof
run.prof% sort time
run.prof% stats
Tue Jun 30 18:43:53 2015    run.prof

         30000014 function calls in 15.617 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    9.597    9.597   10.268   10.268 demo.py:6(process_slow)
        1    3.850    3.850    5.302    5.302 demo.py:12(process_fast)
 20000000    1.267    0.000    1.267    0.000 {method 'get' of 'dict' objects}
 10000000    0.790    0.000    0.790    0.000 {method 'append' of 'list' objects}
        1    0.066    0.066    0.066    0.066 {method 'join' of 'str' objects}
        1    0.038    0.038    5.340    5.340 run.py:7(fast_encode)
        1    0.009    0.009   15.617   15.617 run.py:1(<module>)
        1    0.000    0.000   10.268   10.268 run.py:4(slow_encode)

slow: 10.27 sec = 1.12x
fast: 5.34 sec = 1.69x
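The same kind of report can also be produced from inside a script rather than from the command line. The following is a minimal sketch assuming Python 3 and the standard `cProfile`, `pstats`, and `io` modules; the `encode` function is a stand-in written for this example, not the deck's demo.py.

```python
import cProfile
import io
import pstats


def encode(text, char_map):
    # Stand-in workload, similar in spirit to process_fast
    result = []
    for ch in text:
        result.append(char_map.get(ch, ch))
    return ''.join(result)


profiler = cProfile.Profile()
profiler.enable()
encoded = encode('a' * 100000, {'a': 'b', 'b': 'c'})
profiler.disable()

# Roughly what "sort time" followed by "stats" does in the pstats shell.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats('tottime').print_stats(5)
report = buffer.getvalue()
print(report)
```

Enabling the profiler only around the region of interest keeps unrelated start-up code out of the report.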
cProfile + RunSnakeRun
cProfile in PyCharm
slow: 10.07 sec = 1.10x
fast: 5.60 sec = 1.77x
line_profiler results

Total time: 18.095 s
File: demo_lp.py
Function: process_slow at line 6

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     6                                           @profile
     7                                           def process_slow(self):
     8         1           14     14.0      0.0      result = ''
     9  10000001     10260548      1.0     23.3      for ch in self.input:
    10  10000000     33814644      3.4     76.7          result += self.CHAR_MAP.get(...
    11         1            4      4.0      0.0      return result

Total time: 16.8512 s
File: demo_lp.py
Function: process_fast at line 13

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    13                                           @profile
    14                                           def process_fast(self):
    15         1            7      7.0      0.0      result = []
    16  10000001     13684785      1.4     33.3      for ch in self.input:
    17  10000000     27048146      2.7     65.9          result.append(self.CHAR_MAP.get(...
    18         1       312611 312611.0      0.8      return ''.join(result)

slow: 24.32 sec = 2.66x
fast: 25.37 sec = 8.03x
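One way to see why line-level instrumentation is expensive is to count line hits with the standard `sys.settrace` hook, which calls back into Python for every executed line of the traced function. This is only a sketch, assuming Python 3; it is not how line_profiler is implemented, and `count_line_hits` is a hypothetical helper.

```python
import sys
from collections import Counter


def count_line_hits(func, *args):
    """Count how often each line of func executes (hypothetical helper)."""
    hits = Counter()
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None            # do not trace other functions
        if event == 'line':
            # Key by offset from the 'def' line of the traced function
            hits[frame.f_lineno - code.co_firstlineno] += 1
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)         # always uninstall the hook
    return result, hits


def process_fast(text, char_map):
    result = []
    for ch in text:
        result.append(char_map.get(ch, ch))
    return ''.join(result)


encoded, hits = count_line_hits(process_fast, 'abc', {'a': 'b', 'b': 'c'})
for offset in sorted(hits):
    print('line +%d: %d hits' % (offset, hits[offset]))
```

With ten million loop iterations, a callback like this fires tens of millions of times, which is consistent with the large slowdown factors reported above.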
Python Tools GUI example
Note: Wallclock time (not CPU)
slow: 17.40 sec = 1.90x
fast: 12.08 sec = 3.82x
Intel® VTune™ Amplifier example
slow: 10.85 sec = 1.19x
fast: 3.30 sec = 1.05x
Intel® VTune™ Amplifier – source view
Intel® VTune™ Amplifier: Accurate & Easy
Line-level profiling details:
• Uses sampling profiling technique
• Average overhead ~1.1x-1.6x (on certain benchmarks)
Cross-platform:
• Windows and Linux
• Python 32- and 64-bit; 2.7.x, 3.4.x, 3.5.0 versions
Rich Graphical UI
Supported workflows:
• Start application, wait for it to finish
• Attach to application, profile for a bit, detach
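The sampling technique mentioned above can be illustrated with a toy sampler: a background thread wakes up at a fixed interval and records where the target thread currently is. This is a rough sketch, assuming CPython 3 (it relies on the CPython-specific `sys._current_frames`); the names `sample_thread` and `busy` are invented here, and this is not a description of how VTune Amplifier works internally.

```python
import collections
import sys
import threading
import time


def sample_thread(thread_id, counts, stop, interval=0.005):
    """Periodically record the function and line the target thread is in."""
    while not stop.is_set():
        frame = sys._current_frames().get(thread_id)
        if frame is not None:
            counts[(frame.f_code.co_name, frame.f_lineno)] += 1
        time.sleep(interval)


def busy(seconds):
    """Hypothetical CPU-bound workload."""
    deadline = time.time() + seconds
    total = 0
    while time.time() < deadline:
        total += 1
    return total


counts = collections.Counter()
stop = threading.Event()
sampler = threading.Thread(
    target=sample_thread, args=(threading.get_ident(), counts, stop))
sampler.start()
busy(0.5)
stop.set()
sampler.join()

# Aggregate line-level samples into per-function totals.
by_function = collections.Counter()
for (name, _lineno), n in counts.items():
    by_function[name] += n
print(by_function.most_common(3))
```

The profiled code runs unmodified and the cost is tied to the sampling interval, not to the number of calls or lines executed, which is why sampling overhead can stay low while still attributing time to individual lines.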
Low Overhead and Line-Level Info

| Tool | Description | Platforms | Profile level | Avg. overhead |
|---|---|---|---|---|
| Intel® VTune™ Amplifier | Rich GUI viewer | Windows, Linux | Line | ~1.1-1.6x |
| cProfile (built-in) | Text interactive mode: “pstats” (built-in); GUI viewer: RunSnakeRun (Open Source); PyCharm | Any | Function | 1.3x-5x |
| Python Tools | Visual Studio (2010+); Open Source | Windows | Function | ~2x |
| line_profiler | Pure Python; Open Source; Text-only viewer | Any | Line | Up to 10x or more |
We’ve Had Success Tuning Our Python Code
• One widely used web page in our internal Buildbot service: 3x speedup (from 90 seconds to 28)
• Report generator: from 350 sec to <2 sec for a 1 MB log file
  • A distilled version was the base for demo.py
• Internal SCons-based build system: several places sped up 2x or more
  • Loading all configs from scratch tuned from 6 minutes to 3 minutes
Sign Up with Us to Give the Profiler a Try & Check out Intel® Software Development Tools
• Technical Preview & Beta Participation: email us at scripting@intel.com
• We’re also working on Intel-accelerated Python (e.g. NumPy/SciPy), which is currently in Tech Preview. Sign up!
• Check out Intel® Developer Zone: software.intel.com
• Check out Intel® Software Development tools
• Qualify for Free Intel® Software Development tools
Free Intel® Software Development Tools
Intel® Performance Libraries for academic research
Visit us at https://software.intel.com/en-us/qualify-for-free-software
Q & A
… and again:
• For Tech Preview and Beta, drop us an email at scripting@intel.com
• Check out free Intel® software: just google for “free intel tools” to see if you qualify
Performance Starts Here!

| You are a | Products Available++ | Support Model | Price |
|---|---|---|---|
| Commercial Developer or Academic Researcher | Intel® Parallel Studio XE (Compilers, Performance Libraries & Analyzers) | Intel® Premier Support | $699** - $2949** |
| Academic Researcher+ | Intel® Performance Libraries: Intel® Math Kernel Library, Intel® MPI Library, Intel® Threading Building Blocks, Intel® Integrated Performance Primitives | Forum only support | Free! |
| Student+, Educator+ | Intel® Parallel Studio XE Cluster Edition | Forum only support | Free! |
| Open Source Contributor+ | Intel® Parallel Studio XE Professional Edition | Forum only support | Free! |

+Subject to qualification   ++OS Support varies by product   **Single Seat Pricing
Supporting Enterprise System Rollouts with SplunkSupporting Enterprise System Rollouts with Splunk
Supporting Enterprise System Rollouts with Splunk
Erin Sweeney
 
Benchmarking PyCon AU 2011 v0
Benchmarking PyCon AU 2011 v0Benchmarking PyCon AU 2011 v0
Benchmarking PyCon AU 2011 v0
Tennessee Leeuwenburg
 
Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...
Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...
Accelerate Your Python* Code through Profiling, Tuning, and Compilation Part ...
Intel® Software
 
Faster Python Programs Through Optimization by Dr.-Ing Mike Muller
Faster Python Programs Through Optimization by Dr.-Ing Mike MullerFaster Python Programs Through Optimization by Dr.-Ing Mike Muller
Faster Python Programs Through Optimization by Dr.-Ing Mike Muller
PyData
 
Ready access to high performance Python with Intel Distribution for Python 2018
Ready access to high performance Python with Intel Distribution for Python 2018Ready access to high performance Python with Intel Distribution for Python 2018
Ready access to high performance Python with Intel Distribution for Python 2018
AWS User Group Bengaluru
 
NFF-GO (YANFF) - Yet Another Network Function Framework
NFF-GO (YANFF) - Yet Another Network Function FrameworkNFF-GO (YANFF) - Yet Another Network Function Framework
NFF-GO (YANFF) - Yet Another Network Function Framework
Michelle Holley
 
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
TDC2019 Intel Software Day - Tecnicas de Programacao Paralela em Machine Lear...
tdc-globalcode
 
First Steps in Python Programming
First Steps in Python ProgrammingFirst Steps in Python Programming
First Steps in Python Programming
Dozie Agbo
 
Intel Technologies for High Performance Computing
Intel Technologies for High Performance ComputingIntel Technologies for High Performance Computing
Intel Technologies for High Performance Computing
Intel Software Brasil
 
TDC2017 | São Paulo - Trilha Machine Learning How we figured out we had a SRE...
TDC2017 | São Paulo - Trilha Machine Learning How we figured out we had a SRE...TDC2017 | São Paulo - Trilha Machine Learning How we figured out we had a SRE...
TDC2017 | São Paulo - Trilha Machine Learning How we figured out we had a SRE...
tdc-globalcode
 
Tendências da junção entre Big Data Analytics, Machine Learning e Supercomput...
Tendências da junção entre Big Data Analytics, Machine Learning e Supercomput...Tendências da junção entre Big Data Analytics, Machine Learning e Supercomput...
Tendências da junção entre Big Data Analytics, Machine Learning e Supercomput...
Igor José F. Freitas
 
OPENMP ANALYSIS IN VTUNE AMPLIFIER XE
OPENMP ANALYSIS IN VTUNE AMPLIFIER XEOPENMP ANALYSIS IN VTUNE AMPLIFIER XE
OPENMP ANALYSIS IN VTUNE AMPLIFIER XE
DESMOND YUEN
 
Intro To Spring Python
Intro To Spring PythonIntro To Spring Python
Intro To Spring Python
gturnquist
 
【視覺進化論】AI智慧視覺運算技術論壇_2_ChungYeh
【視覺進化論】AI智慧視覺運算技術論壇_2_ChungYeh【視覺進化論】AI智慧視覺運算技術論壇_2_ChungYeh
【視覺進化論】AI智慧視覺運算技術論壇_2_ChungYeh
MAKERPRO.cc
 
Intel® VTune™ Amplifier - Intel Software Conference 2013
Intel® VTune™ Amplifier - Intel Software Conference 2013Intel® VTune™ Amplifier - Intel Software Conference 2013
Intel® VTune™ Amplifier - Intel Software Conference 2013
Intel Software Brasil
 
HFM API Deep Dive – Making a Better Financial Management Client
HFM API Deep Dive – Making a Better Financial Management ClientHFM API Deep Dive – Making a Better Financial Management Client
HFM API Deep Dive – Making a Better Financial Management Client
Charles Beyer
 
Tuning For Deep Learning Inference with Intel® Processor Graphics | SIGGRAPH ...
Tuning For Deep Learning Inference with Intel® Processor Graphics | SIGGRAPH ...Tuning For Deep Learning Inference with Intel® Processor Graphics | SIGGRAPH ...
Tuning For Deep Learning Inference with Intel® Processor Graphics | SIGGRAPH ...
Intel® Software
 
Supporting Enterprise System Rollouts with Splunk
Supporting Enterprise System Rollouts with SplunkSupporting Enterprise System Rollouts with Splunk
Supporting Enterprise System Rollouts with Splunk
Erin Sweeney
 
Ad

Recently uploaded (20)

Raish Khanji GTU 8th sem Internship Report.pdf
Raish Khanji GTU 8th sem Internship Report.pdfRaish Khanji GTU 8th sem Internship Report.pdf
Raish Khanji GTU 8th sem Internship Report.pdf
RaishKhanji
 
π0.5: a Vision-Language-Action Model with Open-World Generalization
π0.5: a Vision-Language-Action Model with Open-World Generalizationπ0.5: a Vision-Language-Action Model with Open-World Generalization
π0.5: a Vision-Language-Action Model with Open-World Generalization
NABLAS株式会社
 
MAQUINARIA MINAS CEMA 6th Edition (1).pdf
MAQUINARIA MINAS CEMA 6th Edition (1).pdfMAQUINARIA MINAS CEMA 6th Edition (1).pdf
MAQUINARIA MINAS CEMA 6th Edition (1).pdf
ssuser562df4
 
Data Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptxData Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptx
RushaliDeshmukh2
 
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G..."Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
Infopitaara
 
Smart Storage Solutions.pptx for production engineering
Smart Storage Solutions.pptx for production engineeringSmart Storage Solutions.pptx for production engineering
Smart Storage Solutions.pptx for production engineering
rushikeshnavghare94
 
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdffive-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
AdityaSharma944496
 
The Gaussian Process Modeling Module in UQLab
The Gaussian Process Modeling Module in UQLabThe Gaussian Process Modeling Module in UQLab
The Gaussian Process Modeling Module in UQLab
Journal of Soft Computing in Civil Engineering
 
some basics electrical and electronics knowledge
some basics electrical and electronics knowledgesome basics electrical and electronics knowledge
some basics electrical and electronics knowledge
nguyentrungdo88
 
Oil-gas_Unconventional oil and gass_reseviours.pdf
Oil-gas_Unconventional oil and gass_reseviours.pdfOil-gas_Unconventional oil and gass_reseviours.pdf
Oil-gas_Unconventional oil and gass_reseviours.pdf
M7md3li2
 
Degree_of_Automation.pdf for Instrumentation and industrial specialist
Degree_of_Automation.pdf for  Instrumentation  and industrial specialistDegree_of_Automation.pdf for  Instrumentation  and industrial specialist
Degree_of_Automation.pdf for Instrumentation and industrial specialist
shreyabhosale19
 
Metal alkyne complexes.pptx in chemistry
Metal alkyne complexes.pptx in chemistryMetal alkyne complexes.pptx in chemistry
Metal alkyne complexes.pptx in chemistry
mee23nu
 
QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)
rccbatchplant
 
Smart_Storage_Systems_Production_Engineering.pptx
Smart_Storage_Systems_Production_Engineering.pptxSmart_Storage_Systems_Production_Engineering.pptx
Smart_Storage_Systems_Production_Engineering.pptx
rushikeshnavghare94
 
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
inmishra17121973
 
Reagent dosing (Bredel) presentation.pptx
Reagent dosing (Bredel) presentation.pptxReagent dosing (Bredel) presentation.pptx
Reagent dosing (Bredel) presentation.pptx
AlejandroOdio
 
fluke dealers in bangalore..............
fluke dealers in bangalore..............fluke dealers in bangalore..............
fluke dealers in bangalore..............
Haresh Vaswani
 
Avnet Silica's PCIM 2025 Highlights Flyer
Avnet Silica's PCIM 2025 Highlights FlyerAvnet Silica's PCIM 2025 Highlights Flyer
Avnet Silica's PCIM 2025 Highlights Flyer
WillDavies22
 
Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...
Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...
Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...
Journal of Soft Computing in Civil Engineering
 
Process Parameter Optimization for Minimizing Springback in Cold Drawing Proc...
Process Parameter Optimization for Minimizing Springback in Cold Drawing Proc...Process Parameter Optimization for Minimizing Springback in Cold Drawing Proc...
Process Parameter Optimization for Minimizing Springback in Cold Drawing Proc...
Journal of Soft Computing in Civil Engineering
 
Raish Khanji GTU 8th sem Internship Report.pdf
Raish Khanji GTU 8th sem Internship Report.pdfRaish Khanji GTU 8th sem Internship Report.pdf
Raish Khanji GTU 8th sem Internship Report.pdf
RaishKhanji
 
π0.5: a Vision-Language-Action Model with Open-World Generalization
π0.5: a Vision-Language-Action Model with Open-World Generalizationπ0.5: a Vision-Language-Action Model with Open-World Generalization
π0.5: a Vision-Language-Action Model with Open-World Generalization
NABLAS株式会社
 
MAQUINARIA MINAS CEMA 6th Edition (1).pdf
MAQUINARIA MINAS CEMA 6th Edition (1).pdfMAQUINARIA MINAS CEMA 6th Edition (1).pdf
MAQUINARIA MINAS CEMA 6th Edition (1).pdf
ssuser562df4
 
Data Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptxData Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptx
RushaliDeshmukh2
 
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G..."Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
Infopitaara
 
Smart Storage Solutions.pptx for production engineering
Smart Storage Solutions.pptx for production engineeringSmart Storage Solutions.pptx for production engineering
Smart Storage Solutions.pptx for production engineering
rushikeshnavghare94
 
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdffive-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
AdityaSharma944496
 
some basics electrical and electronics knowledge
some basics electrical and electronics knowledgesome basics electrical and electronics knowledge
some basics electrical and electronics knowledge
nguyentrungdo88
 
Oil-gas_Unconventional oil and gass_reseviours.pdf
Oil-gas_Unconventional oil and gass_reseviours.pdfOil-gas_Unconventional oil and gass_reseviours.pdf
Oil-gas_Unconventional oil and gass_reseviours.pdf
M7md3li2
 
Degree_of_Automation.pdf for Instrumentation and industrial specialist
Degree_of_Automation.pdf for  Instrumentation  and industrial specialistDegree_of_Automation.pdf for  Instrumentation  and industrial specialist
Degree_of_Automation.pdf for Instrumentation and industrial specialist
shreyabhosale19
 
Metal alkyne complexes.pptx in chemistry
Metal alkyne complexes.pptx in chemistryMetal alkyne complexes.pptx in chemistry
Metal alkyne complexes.pptx in chemistry
mee23nu
 
QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)
rccbatchplant
 
Smart_Storage_Systems_Production_Engineering.pptx
Smart_Storage_Systems_Production_Engineering.pptxSmart_Storage_Systems_Production_Engineering.pptx
Smart_Storage_Systems_Production_Engineering.pptx
rushikeshnavghare94
 
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
inmishra17121973
 
Reagent dosing (Bredel) presentation.pptx
Reagent dosing (Bredel) presentation.pptxReagent dosing (Bredel) presentation.pptx
Reagent dosing (Bredel) presentation.pptx
AlejandroOdio
 
fluke dealers in bangalore..............
fluke dealers in bangalore..............fluke dealers in bangalore..............
fluke dealers in bangalore..............
Haresh Vaswani
 
Avnet Silica's PCIM 2025 Highlights Flyer
Avnet Silica's PCIM 2025 Highlights FlyerAvnet Silica's PCIM 2025 Highlights Flyer
Avnet Silica's PCIM 2025 Highlights Flyer
WillDavies22
 

Vasiliy Litvinov - Python Profiling

  • 1. Tuning Python Applications Can Dramatically Increase Performance Vasilij Litvinov Software Engineer, Intel
  • 2. Legal Disclaimer & Optimization Notice
    Copyright © 2016, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
    INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
    Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
    Copyright © 2016, Intel Corporation. All rights reserved. Intel, Pentium, Xeon, Xeon Phi, Core, VTune, Cilk, and the Intel logo are trademarks of Intel Corporation in the U.S. and other countries.
    Optimization Notice: Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804
  • 3. Agenda
      • Why do we need Python optimization?
      • How does one find the code to tune?
      • Overview of existing tools
      • An example
      • Intel® VTune™ Amplifier capabilities and comparison
      • Q & A
  • 4. Why do we need Python optimization?
      • Python powers a wide range of software, including applications where performance matters
      • Some Python code may not scale well, but you won’t know it unless you give it enough workload to chew on
      • Sometimes you are just not happy with the speed of your code
    All in all, there are times when you want to make your code run faster, be more responsive, (insert your favorite buzzword here). So, you need to optimize (or tune) your code.
  • 5. How to find the code to tune: measuring vs. guessing
      • Hard stare = Often wrong
      • Logging = Analysis is tedious
      • Profile = Accurate, Easy
  • 6. Not All Profilers Are Equal
    There are different profiling techniques, for example:
      • Event-based
          • Example: built-in Python cProfile profiler
      • Instrumentation-based
          • Usually requires modifying the target application (source code, compiler support, etc.)
          • Example: line_profiler
      • Statistical
          • Accurate enough & less intrusive
          • Example: vmstat, statprof
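To make the event-based technique concrete, here is a minimal sketch using the standard `sys.setprofile` hook, which the interpreter calls on every function call and return. The names `count_calls`, `helper`, and `workload` are hypothetical and exist only for this illustration; this is not how cProfile itself is implemented, only the same general idea in pure Python.

```python
import sys
from collections import Counter


def count_calls(func, *args, **kwargs):
    """Run func and count Python-level function calls by name."""
    calls = Counter()

    def on_event(frame, event, arg):
        # The interpreter invokes this hook for call/return events;
        # only Python-level "call" events are counted here.
        if event == "call":
            calls[frame.f_code.co_name] += 1

    sys.setprofile(on_event)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.setprofile(None)  # always remove the hook again
    return result, calls


def helper(x):
    return x + 1


def workload(n):
    total = 0
    for i in range(n):
        total += helper(i)
    return total


result, calls = count_calls(workload, 1000)
print(calls.most_common(2))
```

Because the hook runs on every call, the cost grows with the number of calls in the program, which is one reason event-based profilers can show noticeable overhead on call-heavy code.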
  • 7. Most Profilers: High Overhead, No Line Info
    Tool | Description | Platforms | Profile level | Avg. overhead
    cProfile (built-in) | Text interactive mode: “pstats” (built-in); GUI viewer: RunSnakeRun; Open Source | Any | Function | 1.3x-5x
    Python Tools | Visual Studio (2010+); Open Source | Windows | Function | ~2x
    PyCharm | Not free; cProfile/yappi based | Any | Function | 1.3x-5x (same as cProfile)
    line_profiler | Pure Python; Open Source; Text-only viewer | Any | Line | Up to 10x or more
  • 8. Example performance hogs
      • Task: concatenate a list
        Slow way:
            s = ''
            for ch in some_lst:
                s += ch
        Faster way:
            s = ''.join(some_lst)
        Reason: concatenating requires re-allocating and copying memory, as strings are immutable
      • Task: remove some value from a list
        Slow way:
            while some_value in lst:
                lst.remove(some_value)
        Faster way:
            while True:
                try:
                    lst.remove(some_value)
                except ValueError:
                    break
        Reason: both in and .remove() have complexity O(n), and the slower version searches the list twice for each removal, so it’s about twice as slow
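The two hogs above can be checked by measurement rather than by guessing. The following is a rough sketch using the standard `timeit` module; the function names and input sizes are assumptions chosen for illustration, and the timings will vary by interpreter (CPython, for instance, can sometimes optimize `s += ch` in place, which narrows the gap).

```python
import timeit


def concat_slow(chars):
    s = ''
    for ch in chars:
        s += ch  # may allocate a new string on each iteration
    return s


def concat_fast(chars):
    return ''.join(chars)  # single pass, single allocation


def remove_all_slow(lst, value):
    while value in lst:    # first O(n) search
        lst.remove(value)  # second O(n) search
    return lst


def remove_all_fast(lst, value):
    while True:
        try:
            lst.remove(value)  # one O(n) search per removal
        except ValueError:
            break
    return lst


chars = ['a'] * 10000
data = [i % 10 for i in range(2000)]

t_slow = timeit.timeit(lambda: concat_slow(chars), number=20)
t_fast = timeit.timeit(lambda: concat_fast(chars), number=20)
print('concat  slow: %.4f s  fast: %.4f s' % (t_slow, t_fast))

r_slow = timeit.timeit(lambda: remove_all_slow(list(data), 3), number=20)
r_fast = timeit.timeit(lambda: remove_all_fast(list(data), 3), number=20)
print('remove  slow: %.4f s  fast: %.4f s' % (r_slow, r_fast))
```

Both variants of each task produce the same result, so the comparison only measures speed, not behavior.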
  • 9. Python example to profile: demo.py
        class Encoder:
            CHAR_MAP = {'a': 'b', 'b': 'c'}
            def __init__(self, input):
                self.input = input

            def process_slow(self):
                result = ''
                for ch in self.input:
                    result += self.CHAR_MAP.get(ch, ch)
                return result

            def process_fast(self):
                result = []
                for ch in self.input:
                    result.append(self.CHAR_MAP.get(ch, ch))
                return ''.join(result)
  • 10. Python sample to profile: run.py
        import demo
        import time

        def slow_encode(input):
            return demo.Encoder(input).process_slow()

        def fast_encode(input):
            return demo.Encoder(input).process_fast()

        if __name__ == '__main__':
            input = 'a' * 10000000  # 10 million 'a' characters
            start = time.time()
            s1 = slow_encode(input)
            slow_stop = time.time()
            # Parenthesized print works on both Python 2 and Python 3
            print('slow: %.2f sec' % (slow_stop - start))
            s2 = fast_encode(input)
            print('fast: %.2f sec' % (time.time() - slow_stop))
    No profiling overhead: a baseline for comparing the tools’ overhead.
    slow: 9.15 sec = 1.00x, fast: 3.16 sec = 1.00x
  • 11. cProfile + pstats UI example
        > python -m cProfile -o run.prof run.py
        > python -m pstats run.prof
        run.prof% sort time
        run.prof% stats
        Tue Jun 30 18:43:53 2015    run.prof

                 30000014 function calls in 15.617 seconds

           Ordered by: internal time

             ncalls  tottime  percall  cumtime  percall filename:lineno(function)
                  1    9.597    9.597   10.268   10.268 demo.py:6(process_slow)
                  1    3.850    3.850    5.302    5.302 demo.py:12(process_fast)
           20000000    1.267    0.000    1.267    0.000 {method 'get' of 'dict' objects}
           10000000    0.790    0.000    0.790    0.000 {method 'append' of 'list' objects}
                  1    0.066    0.066    0.066    0.066 {method 'join' of 'str' objects}
                  1    0.038    0.038    5.340    5.340 run.py:7(fast_encode)
                  1    0.009    0.009   15.617   15.617 run.py:1(<module>)
                  1    0.000    0.000   10.268   10.268 run.py:4(slow_encode)
    slow: 10.27 sec = 1.12x, fast: 5.34 sec = 1.69x
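The same cProfile and pstats machinery can also be driven from inside a program instead of the command line. The sketch below assumes Python 3 (for `io.StringIO`) and uses a standalone `process_fast` function modeled on the demo's method rather than the original `Encoder` class; the input is deliberately much smaller than in the slides.

```python
import cProfile
import io
import pstats


def process_fast(text, char_map):
    # Same idea as Encoder.process_fast in demo.py, as a plain function
    result = []
    for ch in text:
        result.append(char_map.get(ch, ch))
    return ''.join(result)


profiler = cProfile.Profile()
profiler.enable()
encoded = process_fast('a' * 100000, {'a': 'b', 'b': 'c'})
profiler.disable()

# Render the five most expensive entries, sorted by internal time
# (the equivalent of "sort time" in the interactive pstats shell).
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats('tottime').print_stats(5)
report = buffer.getvalue()
print(report)
```

Profiling only the region between `enable()` and `disable()` keeps start-up and teardown code out of the report.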
  • 12. cProfile + RunSnakeRun
  • 13. cProfile in PyCharm
    slow: 10.07 sec = 1.10x, fast: 5.60 sec = 1.77x
  • 14. line_profiler results
        Total time: 18.095 s
        File: demo_lp.py
        Function: process_slow at line 6

        Line #      Hits         Time  Per Hit   % Time  Line Contents
        ==============================================================
             6                                           @profile
             7                                           def process_slow(self):
             8         1           14     14.0      0.0      result = ''
             9  10000001     10260548      1.0     23.3      for ch in self.input:
            10  10000000     33814644      3.4     76.7          result += self.CHAR_MAP.get(...
            11         1            4      4.0      0.0      return result

        Total time: 16.8512 s
        File: demo_lp.py
        Function: process_fast at line 13

        Line #      Hits         Time  Per Hit   % Time  Line Contents
        ==============================================================
            13                                           @profile
            14                                           def process_fast(self):
            15         1            7      7.0      0.0      result = []
            16  10000001     13684785      1.4     33.3      for ch in self.input:
            17  10000000     27048146      2.7     65.9          result.append(self.CHAR_MAP.get(...
            18         1       312611 312611.0      0.8      return ''.join(result)
    slow: 24.32 sec = 2.66x, fast: 25.37 sec = 8.03x
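The high overhead of line-level instrumentation is easier to appreciate with a toy version. The sketch below counts per-line hits with the standard `sys.settrace` hook; it is not line_profiler's actual implementation, and `count_line_hits` and `encode` are hypothetical names used only here. Line offsets are reported relative to the `def` line of the traced function.

```python
import sys
from collections import Counter


def count_line_hits(func, *args):
    """Count how often each line of func executes (offset from its def line)."""
    hits = Counter()
    target = func.__code__

    def local_trace(frame, event, arg):
        # Called by the interpreter before each new line in the traced frame
        if event == 'line':
            hits[frame.f_lineno - target.co_firstlineno] += 1
        return local_trace

    def global_trace(frame, event, arg):
        # Only install line tracing for frames that run the target function
        if event == 'call' and frame.f_code is target:
            return local_trace
        return None

    sys.settrace(global_trace)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)  # always remove the hook again
    return result, hits


def encode(text):
    result = []
    for ch in text:
        result.append(ch.upper())
    return ''.join(result)


result, hits = count_line_hits(encode, 'abc')
for offset in sorted(hits):
    print('line +%d: %d hits' % (offset, hits[offset]))
```

A Python-level callback runs for every executed line, so the cost scales with the number of lines executed rather than with the number of function calls.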
  • 15. Python Tools GUI example
    Note: Wallclock time (not CPU)
    slow: 17.40 sec = 1.90x, fast: 12.08 sec = 3.82x
  • 16. Intel® VTune™ Amplifier example
    slow: 10.85 sec = 1.19x, fast: 3.30 sec = 1.05x
  • 17. Intel® VTune™ Amplifier: source view
  • 18. Intel® VTune™ Amplifier: Accurate & Easy
      • Line-level profiling details:
          • Uses sampling profiling technique
          • Average overhead ~1.1x-1.6x (on certain benchmarks)
      • Cross-platform:
          • Windows and Linux
          • Python 32- and 64-bit; 2.7.x, 3.4.x, 3.5.0 versions
      • Rich Graphical UI
      • Supported workflows:
          • Start application, wait for it to finish
          • Attach to application, profile for a bit, detach
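As a rough illustration of why sampling can deliver line-level data at low overhead, the sketch below periodically records which line the main thread is executing. It says nothing about how VTune Amplifier works internally; it is a pure-Python toy that relies on the CPython-specific `sys._current_frames()` and on Python 3's `threading.get_ident()`, and the names `sample_main_thread` and `busy` are hypothetical.

```python
import collections
import sys
import threading
import time


def sample_main_thread(workload, interval=0.001):
    """Run workload while a helper thread samples the caller's current line."""
    samples = collections.Counter()
    target_id = threading.get_ident()
    stop = threading.Event()

    def sampler():
        while not stop.is_set():
            # Snapshot of the frame currently executing in the target thread
            frame = sys._current_frames().get(target_id)
            if frame is not None:
                samples[(frame.f_code.co_name, frame.f_lineno)] += 1
            time.sleep(interval)

    thread = threading.Thread(target=sampler)
    thread.start()
    try:
        result = workload()
    finally:
        stop.set()
        thread.join()
    return result, samples


def busy():
    # Spin for roughly 0.3 seconds so the sampler has something to observe
    deadline = time.time() + 0.3
    count = 0
    while time.time() < deadline:
        count += 1
    return count


result, samples = sample_main_thread(busy)
for (name, line), n in samples.most_common(3):
    print('%s:%d  %d samples' % (name, line, n))
```

The profiled code runs unmodified and pays only for the occasional snapshot, which is the trade-off the slide describes: slightly less exact counts in exchange for much lower intrusion.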
  • 19. Low Overhead and Line-Level Info
    Tool | Description | Platforms | Profile level | Avg. overhead
    Intel® VTune™ Amplifier | Rich GUI viewer | Windows, Linux | Line | ~1.1-1.6x
    cProfile (built-in) | Text interactive mode: “pstats” (built-in); GUI viewer: RunSnakeRun (Open Source); PyCharm | Any | Function | 1.3x-5x
    Python Tools | Visual Studio (2010+); Open Source | Windows | Function | ~2x
    line_profiler | Pure Python; Open Source; Text-only viewer | Any | Line | Up to 10x or more
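The "= N.NNx" figures quoted on the example slides appear to be each tool's measured run time divided by the unprofiled baseline from run.py (9.15 sec slow, 3.16 sec fast). The sketch below recomputes them from the timings printed in the slides; the dictionary layout and the `overhead` helper are assumptions made for this illustration.

```python
# Baseline timings without any profiler, in seconds (from the run.py slide)
BASELINE = {'slow': 9.15, 'fast': 3.16}

# Timings reported on the per-tool example slides, in seconds
MEASURED = {
    'cProfile + pstats': {'slow': 10.27, 'fast': 5.34},
    'cProfile in PyCharm': {'slow': 10.07, 'fast': 5.60},
    'line_profiler': {'slow': 24.32, 'fast': 25.37},
    'Python Tools': {'slow': 17.40, 'fast': 12.08},
    'Intel VTune Amplifier': {'slow': 10.85, 'fast': 3.30},
}


def overhead(tool, variant):
    """Return profiled time divided by the unprofiled baseline."""
    return MEASURED[tool][variant] / BASELINE[variant]


for tool in MEASURED:
    print('%-22s slow: %.2fx  fast: %.2fx'
          % (tool, overhead(tool, 'slow'), overhead(tool, 'fast')))
```

Recomputing from the rounded timings reproduces the slide figures to within 0.01x; the VTune "fast" case comes out near 1.04x rather than the 1.05x shown, presumably because the slide was computed from unrounded measurements.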
  • 20. We’ve Had Success Tuning Our Python Code
      • One widely-used web page in our internally set up Buildbot service: 3x speed up (from 90 seconds to 28)
      • Report generator: from 350 sec to <2 sec for 1MB log file
          • Distilled version was the base for demo.py
      • Internal SCons-based build system: several places sped up 2x or more
          • Loading all configs from scratch tuned from 6 minutes to 3 minutes
  • 21. Sign Up with Us to Give the Profiler a Try & Check out Intel® Software Development Tools
      • Technical Preview & Beta Participation: email us at [email protected]
      • We’re also working on Intel-accelerated Python (e.g. NumPy/SciPy, etc.), which is currently in Tech Preview. Sign up!
      • Check out Intel® Developer Zone: software.intel.com
      • Check out Intel® Software Development tools
      • Qualify for Free Intel® Software Development tools
  • 22. Free Intel® Software Development Tools
    Intel® Performance Libraries for academic research
    Visit us at https://software.intel.com/en-us/qualify-for-free-software
  • 23. Q & A
    … and again:
      • For Tech Preview and Beta, drop us an email at [email protected]
      • Check out free Intel® software: just google for “free intel tools” to see if you’re qualified
  • 24. Performance Starts Here!
    You are a: | Products Available++ | Support Model | Price
    Commercial Developer or Academic Researcher | Intel® Parallel Studio XE (Compilers, Performance Libraries & Analyzers) | Intel® Premier Support | $699** - $2949**
    Academic Researcher+ | Intel® Performance Libraries: Intel® Math Kernel Library, Intel® MPI Library, Intel® Threading Building Blocks, Intel® Integrated Performance Primitives | Forum only support | Free!
    Student+, Educator+ | Intel® Parallel Studio XE Cluster Edition | Forum only support | Free!
    Open Source Contributor+ | Intel® Parallel Studio XE Professional Edition | Forum only support | Free!
    +Subject to qualification
    ++OS Support varies by product
    **Single Seat Pricing