SlideShare a Scribd company logo
www.morconsulting.c
The High Performance Python
Landscape
- profiling and fast calculation
Ian Ozsvald @IanOzsvald MorConsulting.com
Ian@MorConsulting.com @IanOzsvald
PyDataLondon February 2014
What is “high performance”?
●
Profiling to understand system behaviour
●
We often ignore this step...
●
Speeding up the bottleneck
●
Keeps you on 1 machine (if possible)
●
Keeping team speed high
Ian@MorConsulting.com @IanOzsvald
PyDataLondon February 2014
“High Performance Python”
• “Practical Performant
Programming
for Humans”
• Please join the mailing
list via IanOzsvald.com
Ian@MorConsulting.com @IanOzsvald
PyDataLondon February 2014
cProfile
Ian@MorConsulting.com @IanOzsvald
PyDataLondon February 2014
line_profiler
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     9                                           @profile
    10                                           def calculate_z_serial_purepython(
                                                      maxiter, zs, cs):
    12         1         6870   6870.0      0.0      output = [0] * len(zs)
    13   1000001       781959      0.8      0.8      for i in range(len(zs)):
    14   1000000       767224      0.8      0.8          n = 0
    15   1000000       843432      0.8      0.8          z = zs[i]
    16   1000000       786013      0.8      0.8          c = cs[i]
    17  34219980     36492596      1.1     36.2          while abs(z) < 2 
                                                               and n < maxiter:
    18  33219980     32869046      1.0     32.6              z = z * z + c
    19  33219980     27371730      0.8     27.2              n += 1
    20   1000000       890837      0.9      0.9          output[i] = n
    21         1            4      4.0      0.0      return output
Ian@MorConsulting.com @IanOzsvald
PyDataLondon February 2014
memory_profiler
Line #    Mem usage    Increment   Line Contents
================================================
     9   89.934 MiB    0.000 MiB   @profile
    10                             def calculate_z_serial_purepython(
                                                     maxiter, zs, cs):            
                     
    12   97.566 MiB    7.633 MiB       output = [0] * len(zs)
    13  130.215 MiB   32.648 MiB       for i in range(len(zs)):
    14  130.215 MiB    0.000 MiB           n = 0
    15  130.215 MiB    0.000 MiB           z = zs[i]
    16  130.215 MiB    0.000 MiB           c = cs[i]
    17  130.215 MiB    0.000 MiB           while n < maxiter and abs(z) < 2:
    18  130.215 MiB    0.000 MiB               z = z * z + c
    19  130.215 MiB    0.000 MiB               n += 1
    20  130.215 MiB    0.000 MiB           output[i] = n
    21  122.582 MiB   ­7.633 MiB       return output
Ian@MorConsulting.com @IanOzsvald
PyDataLondon February 2014
memory_profiler mprof
https://github.com/scikit-learn/scikit-l
earn/pull/2248
Before & After an improvement
Ian@MorConsulting.com @IanOzsvald
PyDataLondon February 2014
Transforming memory_profiler
into a resource profiler?
Ian@MorConsulting.com @IanOzsvald
PyDataLondon February 2014
Profiling possibilities
●
CPU (line by line or by function)
●
Memory (line by line)
●
Disk read/write (with some hacking)
●
Network read/write (with some hacking)
●
mmaps
●
File handles
●
Network connections
●
Cache utilisation via libperf?
Ian@MorConsulting.com @IanOzsvald
PyDataLondon February 2014
Cython 0.20 (pyx annotations)
#cython: boundscheck=False
def calculate_z(int maxiter, zs, cs):
    """Calculate output list using Julia update rule"""
    cdef unsigned int i, n
    cdef double complex z, c
    output = [0] * len(zs)
    for i in range(len(zs)):
        n = 0
        z = zs[i]
        c = cs[i]
        while n < maxiter and (z.real * z.real + z.imag * z.imag) < 4:
            z = z * z + c
            n += 1
        output[i] = n
    return output
Pure CPython lists code 12s
Cython lists runtime 0.19s
Cython numpy runtime 0.16s
Ian@MorConsulting.com @IanOzsvald
PyDataLondon February 2014
Cython + numpy + OMP nogil
#cython: boundscheck=False
from cython.parallel import parallel, prange
import numpy as np
cimport numpy as np
def calculate_z(int maxiter, double complex[:] zs, double complex[:] cs):
    cdef unsigned int i, length, n
    cdef double complex z, c
    cdef int[:] output = np.empty(len(zs), dtype=np.int32)
    length = len(zs)
    with nogil, parallel():
        for i in prange(length, schedule="guided"):
            z = zs[i]
            c = cs[i]
            n = 0
            while n < maxiter and (z.real * z.real + z.imag * z.imag) < 4:
                z = z * z + c
                n = n + 1
            output[i] = n
    return output
Runtime 0.05s
Ian@MorConsulting.com @IanOzsvald
PyDataLondon February 2014
ShedSkin 0.9.4 annotations
def calculate_z(maxiter, zs, cs):        # maxiter: [int], zs: 
                           [list(complex)], cs: [list(complex)]
    output = [0] * len(zs)               # [list(int)]
    for i in range(len(zs)):             # [__iter(int)]
        n = 0                            # [int]
        z = zs[i]                        # [complex]
        c = cs[i]                        # [complex]
        while n < maxiter and (… <4):    # [complex]
            z = z * z + c                # [complex]
            n += 1                       # [int]
        output[i] = n                    # [int]
    return output                        # [list(int)]
Couldn't we generate Cython pyx? Runtime 0.22s
Ian@MorConsulting.com @IanOzsvald
PyDataLondon February 2014
Pythran (0.40)
#pythran export calculate_z_serial_purepython(int, 
complex list, complex list)
def calculate_z_serial_purepython(maxiter, zs, cs):
 … 
Support for OpenMP on numpy arrays
Author Serge made an overnight fix – superb
support!
List Runtime 0.4s
#pythran export calculate_z(int, complex[], complex[], int[])
… 
#omp parallel for schedule(dynamic)
OMP numpy Runtime 0.10s
Ian@MorConsulting.com @IanOzsvald
PyDataLondon February 2014
PyPy nightly (and numpypy)
●
“It just works” on Python 2.7 code
●
Clever list strategies (e.g. unboxed, uniform)
●
Little support for pre-existing C extensions (e.g.
the existing numpy)
●
multiprocessing, IPython etc all work fine
●
Python list code runtime: 0.3s
●
(pypy)numpy support is incomplete, bugs are
tackled (numpy runtime 5s [CPython+numpy 56s])
Ian@MorConsulting.com @IanOzsvald
PyDataLondon February 2014
Numba 0.12
from numba import jit
@jit(nopython=True)
def calculate_z_serial_purepython(maxiter, zs, cs, output):
    # couldn't create output, had to pass it in
    # output = numpy.zeros(len(zs), dtype=np.int32)
    for i in xrange(len(zs)):
        n = 0
        z = zs[i]
        c = cs[i]
        #while n < maxiter and abs(z) < 2:  # abs unrecognised
        while n < maxiter and z.real * z.real + z.imag * z.imag < 4:
            z = z * z + c
            n += 1
        output[i] = n
    #return output
Runtime 0.4s
Some Python 3 support, some GPU
prange support missing (was in 0.11)?
0.12 introduces temp limitations
Ian@MorConsulting.com @IanOzsvald
PyDataLondon February 2014
Tool Tradeoffs
●
PyPy no learning curve (pure Py only) easy win?
●
ShedSkin easy (pure Py only) but fairly rare
●
Cython pure Py hours to learn – team cost low (and lots of
online help)
●
Cython numpy OMP days+ to learn – heavy team cost?
●
Numba/Pythran hours to learn, install a bit tricky (Anaconda
easiest for Numba)
●
Pythran OMP very impressive result for little effort
●
Numba big toolchain which might hurt productivity?
●
(numexpr not covered – great for numpy and easy to use)
Ian@MorConsulting.com @IanOzsvald
PyDataLondon February 2014
Wrap up
●
Our profiling options should be richer
●
4-12 physical CPU cores commonplace
●
Cost of hand-annotating code is reduced agility
●
JITs/AST compilers are getting fairly good, manual
intervention still gives best results
BUT! CONSIDER:
●
Automation should (probably) be embraced ($CPUs
< $humans) as team velocity is probably higher
Ian@MorConsulting.com @IanOzsvald
PyDataLondon February 2014
Thank You
• Ian@IanOzsvald.com
• @IanOzsvald
• MorConsulting.com
• Annotate.io
• GitHub/IanOzsvald
Ad

More Related Content

Viewers also liked (17)

The Potential of GPU-driven High Performance Data Analytics in Spark
The Potential of GPU-driven High Performance Data Analytics in SparkThe Potential of GPU-driven High Performance Data Analytics in Spark
The Potential of GPU-driven High Performance Data Analytics in Spark
Spark Summit
 
Python performance profiling
Python performance profilingPython performance profiling
Python performance profiling
Jon Haddad
 
GPU Support in Spark and GPU/CPU Mixed Resource Scheduling at Production Scale
GPU Support in Spark and GPU/CPU Mixed Resource Scheduling at Production ScaleGPU Support in Spark and GPU/CPU Mixed Resource Scheduling at Production Scale
GPU Support in Spark and GPU/CPU Mixed Resource Scheduling at Production Scale
sparktc
 
Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master
Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul MasterCornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master
Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master
Spark Summit
 
How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...
Spark Summit
 
Making Sense of Spark Performance-(Kay Ousterhout, UC Berkeley)
Making Sense of Spark Performance-(Kay Ousterhout, UC Berkeley)Making Sense of Spark Performance-(Kay Ousterhout, UC Berkeley)
Making Sense of Spark Performance-(Kay Ousterhout, UC Berkeley)
Spark Summit
 
Making your code faster cython and parallel processing in the jupyter notebook
Making your code faster   cython and parallel processing in the jupyter notebookMaking your code faster   cython and parallel processing in the jupyter notebook
Making your code faster cython and parallel processing in the jupyter notebook
PyData
 
GPU Support In Spark And GPU/CPU Mixed Resource Scheduling At Production Scale
GPU Support In Spark And GPU/CPU Mixed Resource Scheduling At Production ScaleGPU Support In Spark And GPU/CPU Mixed Resource Scheduling At Production Scale
GPU Support In Spark And GPU/CPU Mixed Resource Scheduling At Production Scale
Spark Summit
 
Getting The Best Performance With PySpark
Getting The Best Performance With PySparkGetting The Best Performance With PySpark
Getting The Best Performance With PySpark
Spark Summit
 
Boosting spark performance: An Overview of Techniques
Boosting spark performance: An Overview of TechniquesBoosting spark performance: An Overview of Techniques
Boosting spark performance: An Overview of Techniques
Ahsan Javed Awan
 
Big Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & AntidotesBig Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Krishna Sankar
 
Improving Python and Spark (PySpark) Performance and Interoperability
Improving Python and Spark (PySpark) Performance and InteroperabilityImproving Python and Spark (PySpark) Performance and Interoperability
Improving Python and Spark (PySpark) Performance and Interoperability
Wes McKinney
 
Deep Learning with Python (PyData Seattle 2015)
Deep Learning with Python (PyData Seattle 2015)Deep Learning with Python (PyData Seattle 2015)
Deep Learning with Python (PyData Seattle 2015)
Alexander Korbonits
 
Spark Summit EU talk by Qifan Pu
Spark Summit EU talk by Qifan PuSpark Summit EU talk by Qifan Pu
Spark Summit EU talk by Qifan Pu
Spark Summit
 
Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...
Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...
Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...
Spark Summit
 
Python Performance Profiling: The Guts And The Glory
Python Performance Profiling: The Guts And The GloryPython Performance Profiling: The Guts And The Glory
Python Performance Profiling: The Guts And The Glory
emptysquare
 
High Performance Python on Apache Spark
High Performance Python on Apache SparkHigh Performance Python on Apache Spark
High Performance Python on Apache Spark
Wes McKinney
 
The Potential of GPU-driven High Performance Data Analytics in Spark
The Potential of GPU-driven High Performance Data Analytics in SparkThe Potential of GPU-driven High Performance Data Analytics in Spark
The Potential of GPU-driven High Performance Data Analytics in Spark
Spark Summit
 
Python performance profiling
Python performance profilingPython performance profiling
Python performance profiling
Jon Haddad
 
GPU Support in Spark and GPU/CPU Mixed Resource Scheduling at Production Scale
GPU Support in Spark and GPU/CPU Mixed Resource Scheduling at Production ScaleGPU Support in Spark and GPU/CPU Mixed Resource Scheduling at Production Scale
GPU Support in Spark and GPU/CPU Mixed Resource Scheduling at Production Scale
sparktc
 
Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master
Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul MasterCornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master
Cornami Accelerates Performance on SPARK: Spark Summit East talk by Paul Master
Spark Summit
 
How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...
Spark Summit
 
Making Sense of Spark Performance-(Kay Ousterhout, UC Berkeley)
Making Sense of Spark Performance-(Kay Ousterhout, UC Berkeley)Making Sense of Spark Performance-(Kay Ousterhout, UC Berkeley)
Making Sense of Spark Performance-(Kay Ousterhout, UC Berkeley)
Spark Summit
 
Making your code faster cython and parallel processing in the jupyter notebook
Making your code faster   cython and parallel processing in the jupyter notebookMaking your code faster   cython and parallel processing in the jupyter notebook
Making your code faster cython and parallel processing in the jupyter notebook
PyData
 
GPU Support In Spark And GPU/CPU Mixed Resource Scheduling At Production Scale
GPU Support In Spark And GPU/CPU Mixed Resource Scheduling At Production ScaleGPU Support In Spark And GPU/CPU Mixed Resource Scheduling At Production Scale
GPU Support In Spark And GPU/CPU Mixed Resource Scheduling At Production Scale
Spark Summit
 
Getting The Best Performance With PySpark
Getting The Best Performance With PySparkGetting The Best Performance With PySpark
Getting The Best Performance With PySpark
Spark Summit
 
Boosting spark performance: An Overview of Techniques
Boosting spark performance: An Overview of TechniquesBoosting spark performance: An Overview of Techniques
Boosting spark performance: An Overview of Techniques
Ahsan Javed Awan
 
Big Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & AntidotesBig Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Krishna Sankar
 
Improving Python and Spark (PySpark) Performance and Interoperability
Improving Python and Spark (PySpark) Performance and InteroperabilityImproving Python and Spark (PySpark) Performance and Interoperability
Improving Python and Spark (PySpark) Performance and Interoperability
Wes McKinney
 
Deep Learning with Python (PyData Seattle 2015)
Deep Learning with Python (PyData Seattle 2015)Deep Learning with Python (PyData Seattle 2015)
Deep Learning with Python (PyData Seattle 2015)
Alexander Korbonits
 
Spark Summit EU talk by Qifan Pu
Spark Summit EU talk by Qifan PuSpark Summit EU talk by Qifan Pu
Spark Summit EU talk by Qifan Pu
Spark Summit
 
Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...
Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...
Ernest: Efficient Performance Prediction for Advanced Analytics on Apache Spa...
Spark Summit
 
Python Performance Profiling: The Guts And The Glory
Python Performance Profiling: The Guts And The GloryPython Performance Profiling: The Guts And The Glory
Python Performance Profiling: The Guts And The Glory
emptysquare
 
High Performance Python on Apache Spark
High Performance Python on Apache SparkHigh Performance Python on Apache Spark
High Performance Python on Apache Spark
Wes McKinney
 

Similar to The High Performance Python Landscape by Ian Ozsvald (20)

Open Source Monitoring in 2015
Open Source Monitoring in 2015Open Source Monitoring in 2015
Open Source Monitoring in 2015
Kris Buytaert
 
Euro python2011 High Performance Python
Euro python2011 High Performance PythonEuro python2011 High Performance Python
Euro python2011 High Performance Python
Ian Ozsvald
 
Python for scientific computing
Python for scientific computingPython for scientific computing
Python for scientific computing
Go Asgard
 
Cluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in CCluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in C
Steffen Wenz
 
Open Source Monitoring in 2019
Open Source Monitoring in 2019 Open Source Monitoring in 2019
Open Source Monitoring in 2019
Kris Buytaert
 
Tutorial on-python-programming
Tutorial on-python-programmingTutorial on-python-programming
Tutorial on-python-programming
Chetan Giridhar
 
GDG Helwan Introduction to python
GDG Helwan Introduction to pythonGDG Helwan Introduction to python
GDG Helwan Introduction to python
Mohamed Hegazy
 
From MonitoringSucks to Monitoring Love , 2016 Edition
From MonitoringSucks to Monitoring Love , 2016 EditionFrom MonitoringSucks to Monitoring Love , 2016 Edition
From MonitoringSucks to Monitoring Love , 2016 Edition
Kris Buytaert
 
Introduction to Raspberry Pi and GPIO
Introduction to Raspberry Pi and GPIOIntroduction to Raspberry Pi and GPIO
Introduction to Raspberry Pi and GPIO
Kris Findlay
 
What is Python? (Silicon Valley CodeCamp 2014)
What is Python? (Silicon Valley CodeCamp 2014)What is Python? (Silicon Valley CodeCamp 2014)
What is Python? (Silicon Valley CodeCamp 2014)
wesley chun
 
Python 101 - Indonesia AI Society.pdf
Python 101 - Indonesia AI Society.pdfPython 101 - Indonesia AI Society.pdf
Python 101 - Indonesia AI Society.pdf
Hendri Karisma
 
PyPy's approach to construct domain-specific language runtime
PyPy's approach to construct domain-specific language runtimePyPy's approach to construct domain-specific language runtime
PyPy's approach to construct domain-specific language runtime
National Cheng Kung University
 
Just-In-Time Compiler in PHP 8
Just-In-Time Compiler in PHP 8Just-In-Time Compiler in PHP 8
Just-In-Time Compiler in PHP 8
Nikita Popov
 
On the necessity and inapplicability of python
On the necessity and inapplicability of pythonOn the necessity and inapplicability of python
On the necessity and inapplicability of python
Yung-Yu Chen
 
On the Necessity and Inapplicability of Python
On the Necessity and Inapplicability of PythonOn the Necessity and Inapplicability of Python
On the Necessity and Inapplicability of Python
Takeshi Akutsu
 
Monitoring Drupal In an Infrastructure as Code Age
Monitoring Drupal In an Infrastructure as Code AgeMonitoring Drupal In an Infrastructure as Code Age
Monitoring Drupal In an Infrastructure as Code Age
Kris Buytaert
 
web programming UNIT VIII python by Bhavsingh Maloth
web programming UNIT VIII python by Bhavsingh Malothweb programming UNIT VIII python by Bhavsingh Maloth
web programming UNIT VIII python by Bhavsingh Maloth
Bhavsingh Maloth
 
Python at yhat (august 2013)
Python at yhat (august 2013)Python at yhat (august 2013)
Python at yhat (august 2013)
Austin Ogilvie
 
PYTHON by kunal.pptx
PYTHON by kunal.pptxPYTHON by kunal.pptx
PYTHON by kunal.pptx
kunalGoyal89
 
PythonBrasil[8] - CPython for dummies
PythonBrasil[8] - CPython for dummiesPythonBrasil[8] - CPython for dummies
PythonBrasil[8] - CPython for dummies
Tatiana Al-Chueyr
 
Open Source Monitoring in 2015
Open Source Monitoring in 2015Open Source Monitoring in 2015
Open Source Monitoring in 2015
Kris Buytaert
 
Euro python2011 High Performance Python
Euro python2011 High Performance PythonEuro python2011 High Performance Python
Euro python2011 High Performance Python
Ian Ozsvald
 
Python for scientific computing
Python for scientific computingPython for scientific computing
Python for scientific computing
Go Asgard
 
Cluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in CCluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in C
Steffen Wenz
 
Open Source Monitoring in 2019
Open Source Monitoring in 2019 Open Source Monitoring in 2019
Open Source Monitoring in 2019
Kris Buytaert
 
Tutorial on-python-programming
Tutorial on-python-programmingTutorial on-python-programming
Tutorial on-python-programming
Chetan Giridhar
 
GDG Helwan Introduction to python
GDG Helwan Introduction to pythonGDG Helwan Introduction to python
GDG Helwan Introduction to python
Mohamed Hegazy
 
From MonitoringSucks to Monitoring Love , 2016 Edition
From MonitoringSucks to Monitoring Love , 2016 EditionFrom MonitoringSucks to Monitoring Love , 2016 Edition
From MonitoringSucks to Monitoring Love , 2016 Edition
Kris Buytaert
 
Introduction to Raspberry Pi and GPIO
Introduction to Raspberry Pi and GPIOIntroduction to Raspberry Pi and GPIO
Introduction to Raspberry Pi and GPIO
Kris Findlay
 
What is Python? (Silicon Valley CodeCamp 2014)
What is Python? (Silicon Valley CodeCamp 2014)What is Python? (Silicon Valley CodeCamp 2014)
What is Python? (Silicon Valley CodeCamp 2014)
wesley chun
 
Python 101 - Indonesia AI Society.pdf
Python 101 - Indonesia AI Society.pdfPython 101 - Indonesia AI Society.pdf
Python 101 - Indonesia AI Society.pdf
Hendri Karisma
 
PyPy's approach to construct domain-specific language runtime
PyPy's approach to construct domain-specific language runtimePyPy's approach to construct domain-specific language runtime
PyPy's approach to construct domain-specific language runtime
National Cheng Kung University
 
Just-In-Time Compiler in PHP 8
Just-In-Time Compiler in PHP 8Just-In-Time Compiler in PHP 8
Just-In-Time Compiler in PHP 8
Nikita Popov
 
On the necessity and inapplicability of python
On the necessity and inapplicability of pythonOn the necessity and inapplicability of python
On the necessity and inapplicability of python
Yung-Yu Chen
 
On the Necessity and Inapplicability of Python
On the Necessity and Inapplicability of PythonOn the Necessity and Inapplicability of Python
On the Necessity and Inapplicability of Python
Takeshi Akutsu
 
Monitoring Drupal In an Infrastructure as Code Age
Monitoring Drupal In an Infrastructure as Code AgeMonitoring Drupal In an Infrastructure as Code Age
Monitoring Drupal In an Infrastructure as Code Age
Kris Buytaert
 
web programming UNIT VIII python by Bhavsingh Maloth
web programming UNIT VIII python by Bhavsingh Malothweb programming UNIT VIII python by Bhavsingh Maloth
web programming UNIT VIII python by Bhavsingh Maloth
Bhavsingh Maloth
 
Python at yhat (august 2013)
Python at yhat (august 2013)Python at yhat (august 2013)
Python at yhat (august 2013)
Austin Ogilvie
 
PYTHON by kunal.pptx
PYTHON by kunal.pptxPYTHON by kunal.pptx
PYTHON by kunal.pptx
kunalGoyal89
 
PythonBrasil[8] - CPython for dummies
PythonBrasil[8] - CPython for dummiesPythonBrasil[8] - CPython for dummies
PythonBrasil[8] - CPython for dummies
Tatiana Al-Chueyr
 
Ad

More from PyData (20)

Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
PyData
 
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif WalshUnit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
PyData
 
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake BolewskiThe TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
PyData
 
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
PyData
 
Deploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne BauerDeploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne Bauer
PyData
 
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam LermaGraph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
PyData
 
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
PyData
 
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo MazzaferroRESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
PyData
 
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
PyData
 
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven LottAvoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
PyData
 
Words in Space - Rebecca Bilbro
Words in Space - Rebecca BilbroWords in Space - Rebecca Bilbro
Words in Space - Rebecca Bilbro
PyData
 
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
PyData
 
Pydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica PuertoPydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica Puerto
PyData
 
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
PyData
 
Extending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will AydExtending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will Ayd
PyData
 
Measuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen HooverMeasuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen Hoover
PyData
 
What's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldWhat's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper Seabold
PyData
 
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
PyData
 
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-WardSolving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
PyData
 
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
PyData
 
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
PyData
 
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif WalshUnit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
PyData
 
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake BolewskiThe TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
PyData
 
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
PyData
 
Deploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne BauerDeploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne Bauer
PyData
 
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam LermaGraph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
PyData
 
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
PyData
 
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo MazzaferroRESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
PyData
 
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
PyData
 
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven LottAvoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
PyData
 
Words in Space - Rebecca Bilbro
Words in Space - Rebecca BilbroWords in Space - Rebecca Bilbro
Words in Space - Rebecca Bilbro
PyData
 
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
PyData
 
Pydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica PuertoPydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica Puerto
PyData
 
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
PyData
 
Extending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will AydExtending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will Ayd
PyData
 
Measuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen HooverMeasuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen Hoover
PyData
 
What's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldWhat's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper Seabold
PyData
 
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
PyData
 
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-WardSolving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
PyData
 
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
PyData
 
Ad

Recently uploaded (20)

Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptxIncreasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Anoop Ashok
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptxIncreasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Anoop Ashok
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 

The High Performance Python Landscape by Ian Ozsvald

  • 1. www.morconsulting.c The High Performance Python Landscape - profiling and fast calculation Ian Ozsvald @IanOzsvald MorConsulting.com
  • 2. [email protected] @IanOzsvald PyDataLondon February 2014 What is “high performance”? ● Profiling to understand system behaviour ● We often ignore this step... ● Speeding up the bottleneck ● Keeps you on 1 machine (if possible) ● Keeping team speed high
  • 3. [email protected] @IanOzsvald PyDataLondon February 2014 “High Performance Python” • “Practical Performant Programming for Humans” • Please join the mailing list via IanOzsvald.com
  • 5. [email protected] @IanOzsvald PyDataLondon February 2014 line_profiler Line #      Hits         Time  Per Hit   % Time  Line Contents ==============================================================      9                                           @profile     10                                           def calculate_z_serial_purepython(                                                       maxiter, zs, cs):     12         1         6870   6870.0      0.0      output = [0] * len(zs)     13   1000001       781959      0.8      0.8      for i in range(len(zs)):     14   1000000       767224      0.8      0.8          n = 0     15   1000000       843432      0.8      0.8          z = zs[i]     16   1000000       786013      0.8      0.8          c = cs[i]     17  34219980     36492596      1.1     36.2          while abs(z) < 2                                                                 and n < maxiter:     18  33219980     32869046      1.0     32.6              z = z * z + c     19  33219980     27371730      0.8     27.2              n += 1     20   1000000       890837      0.9      0.9          output[i] = n     21         1            4      4.0      0.0      return output
  • 6. [email protected] @IanOzsvald PyDataLondon February 2014 memory_profiler Line #    Mem usage    Increment   Line Contents ================================================      9   89.934 MiB    0.000 MiB   @profile     10                             def calculate_z_serial_purepython(                                                      maxiter, zs, cs):                                       12   97.566 MiB    7.633 MiB       output = [0] * len(zs)     13  130.215 MiB   32.648 MiB       for i in range(len(zs)):     14  130.215 MiB    0.000 MiB           n = 0     15  130.215 MiB    0.000 MiB           z = zs[i]     16  130.215 MiB    0.000 MiB           c = cs[i]     17  130.215 MiB    0.000 MiB           while n < maxiter and abs(z) < 2:     18  130.215 MiB    0.000 MiB               z = z * z + c     19  130.215 MiB    0.000 MiB               n += 1     20  130.215 MiB    0.000 MiB           output[i] = n     21  122.582 MiB   ­7.633 MiB       return output
  • 7. [email protected] @IanOzsvald PyDataLondon February 2014 memory_profiler mprof https://github.com/scikit-learn/scikit-l earn/pull/2248 Before & After an improvement
  • 8. [email protected] @IanOzsvald PyDataLondon February 2014 Transforming memory_profiler into a resource profiler?
  • 9. [email protected] @IanOzsvald PyDataLondon February 2014 Profiling possibilities ● CPU (line by line or by function) ● Memory (line by line) ● Disk read/write (with some hacking) ● Network read/write (with some hacking) ● mmaps ● File handles ● Network connections ● Cache utilisation via libperf?
  • 10. [email protected] @IanOzsvald PyDataLondon February 2014 Cython 0.20 (pyx annotations) #cython: boundscheck=False def calculate_z(int maxiter, zs, cs):     """Calculate output list using Julia update rule"""     cdef unsigned int i, n     cdef double complex z, c     output = [0] * len(zs)     for i in range(len(zs)):         n = 0         z = zs[i]         c = cs[i]         while n < maxiter and (z.real * z.real + z.imag * z.imag) < 4:             z = z * z + c             n += 1         output[i] = n     return output Pure CPython lists code 12s Cython lists runtime 0.19s Cython numpy runtime 0.16s
  • 11. [email protected] @IanOzsvald PyDataLondon February 2014 Cython + numpy + OMP nogil #cython: boundscheck=False from cython.parallel import parallel, prange import numpy as np cimport numpy as np def calculate_z(int maxiter, double complex[:] zs, double complex[:] cs):     cdef unsigned int i, length, n     cdef double complex z, c     cdef int[:] output = np.empty(len(zs), dtype=np.int32)     length = len(zs)     with nogil, parallel():         for i in prange(length, schedule="guided"):             z = zs[i]             c = cs[i]             n = 0             while n < maxiter and (z.real * z.real + z.imag * z.imag) < 4:                 z = z * z + c                 n = n + 1             output[i] = n     return output Runtime 0.05s
  • 12. [email protected] @IanOzsvald PyDataLondon February 2014 ShedSkin 0.9.4 annotations def calculate_z(maxiter, zs, cs):        # maxiter: [int], zs:                             [list(complex)], cs: [list(complex)]     output = [0] * len(zs)               # [list(int)]     for i in range(len(zs)):             # [__iter(int)]         n = 0                            # [int]         z = zs[i]                        # [complex]         c = cs[i]                        # [complex]         while n < maxiter and (… <4):    # [complex]             z = z * z + c                # [complex]             n += 1                       # [int]         output[i] = n                    # [int]     return output                        # [list(int)] Couldn't we generate Cython pyx? Runtime 0.22s
  • 13. [email protected] @IanOzsvald PyDataLondon February 2014 Pythran (0.40) #pythran export calculate_z_serial_purepython(int,  complex list, complex list) def calculate_z_serial_purepython(maxiter, zs, cs):  …  Support for OpenMP on numpy arrays Author Serge made an overnight fix – superb support! List Runtime 0.4s #pythran export calculate_z(int, complex[], complex[], int[]) …  #omp parallel for schedule(dynamic) OMP numpy Runtime 0.10s
  • 14. [email protected] @IanOzsvald PyDataLondon February 2014 PyPy nightly (and numpypy) ● “It just works” on Python 2.7 code ● Clever list strategies (e.g. unboxed, uniform) ● Little support for pre-existing C extensions (e.g. the existing numpy) ● multiprocessing, IPython etc all work fine ● Python list code runtime: 0.3s ● (pypy)numpy support is incomplete, bugs are tackled (numpy runtime 5s [CPython+numpy 56s])
  • 15. [email protected] @IanOzsvald PyDataLondon February 2014 Numba 0.12 from numba import jit @jit(nopython=True) def calculate_z_serial_purepython(maxiter, zs, cs, output):     # couldn't create output, had to pass it in     # output = numpy.zeros(len(zs), dtype=np.int32)     for i in xrange(len(zs)):         n = 0         z = zs[i]         c = cs[i]         #while n < maxiter and abs(z) < 2:  # abs unrecognised         while n < maxiter and z.real * z.real + z.imag * z.imag < 4:             z = z * z + c             n += 1         output[i] = n     #return output Runtime 0.4s Some Python 3 support, some GPU prange support missing (was in 0.11)? 0.12 introduces temp limitations
  • 16. [email protected] @IanOzsvald PyDataLondon February 2014 Tool Tradeoffs ● PyPy no learning curve (pure Py only) easy win? ● ShedSkin easy (pure Py only) but fairly rare ● Cython pure Py hours to learn – team cost low (and lots of online help) ● Cython numpy OMP days+ to learn – heavy team cost? ● Numba/Pythran hours to learn, install a bit tricky (Anaconda easiest for Numba) ● Pythran OMP very impressive result for little effort ● Numba big toolchain which might hurt productivity? ● (numexpr not covered – great for numpy and easy to use)
  • 17. [email protected] @IanOzsvald PyDataLondon February 2014 Wrap up ● Our profiling options should be richer ● 4-12 physical CPU cores commonplace ● Cost of hand-annotating code is reduced agility ● JITs/AST compilers are getting fairly good, manual intervention still gives best results BUT! CONSIDER: ● Automation should (probably) be embraced ($CPUs < $humans) as team velocity is probably higher
  • 18. [email protected] @IanOzsvald PyDataLondon February 2014 Thank You • [email protected] • @IanOzsvald • MorConsulting.com • Annotate.io • GitHub/IanOzsvald