Skip to content

Approximate ratio masks significant benchmark changes #752

@h-vetinari

Description

@h-vetinari

ASV sometimes adds a ~ to ratios if there are many sample (with high variance, I'm guessing), but there's a bug somewhere how these results are evaluated when the significant changes are given out.

As an example from running the ASV in a pandas PR:

(pandas-dev) C:\[...]\pddev\asv_bench>asv continuous -f 1.1 upstream/master 0d3c6d21 -b ^strings.Cat
· Creating environments
· Discovering benchmarks
·· Uninstalling from conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
·· Building 0d3c6d21 <cln_str_cat~2> for conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
·· Installing 0d3c6d21 <cln_str_cat~2> into conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
· Running 2 total benchmarks (2 commits * 1 environments * 1 benchmarks)
[  0.00%] · For pandas commit b28cf5aa <master> (round 1/2):
[  0.00%] ·· Building for conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[  0.00%] ·· Benchmarking conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 25.00%] ··· Running (strings.Cat.time_cat--).
[ 25.00%] · For pandas commit 0d3c6d21 <cln_str_cat~2> (round 1/2):
[ 25.00%] ·· Building for conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 25.00%] ·· Benchmarking conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 50.00%] ··· Running (strings.Cat.time_cat--).
[ 50.00%] · For pandas commit 0d3c6d21 <cln_str_cat~2> (round 2/2):
[ 50.00%] ·· Benchmarking conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 75.00%] ··· strings.Cat.time_cat                                                                                                                                                ok
[ 75.00%] ··· ============ ====== ============ ============== ============= ========== ============ ===========
              --                                                 na_rep / na_frac
              ------------------- -----------------------------------------------------------------------------
               other_cols   sep    None / 0.0   None / 0.001   None / 0.15   - / 0.0    - / 0.001     - / 0.15
              ============ ====== ============ ============== ============= ========== ============ ===========
                   0        None    7.81±2ms     10.9±0.8ms     11.7±0.8ms   9.38±0ms   10.9±0.6ms    12.5±0ms
                   0         ,      9.38±0ms      10.9±0ms      11.7±0.8ms   9.38±0ms    10.9±0ms     12.5±0ms
                   3        None    46.9±8ms      62.5±0ms       62.5±6ms    54.7±8ms    62.5±8ms    78.1±10ms
                   3         ,      78.1±0ms      78.1±0ms       62.5±8ms    78.1±8ms    78.1±0ms     93.8±0ms
              ============ ====== ============ ============== ============= ========== ============ ===========

[ 75.00%] · For pandas commit b28cf5aa <master> (round 2/2):
[ 75.00%] ·· Building for conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[ 75.00%] ·· Benchmarking conda-py3.6-Cython-matplotlib-numexpr-numpy-openpyxl-pytables-pytest-scipy-sqlalchemy-xlrd-xlsxwriter-xlwt
[100.00%] ··· strings.Cat.time_cat                                                                                                                                                ok
[100.00%] ··· ============ ====== ============ ============== ============= ========== ============ ==========
              --                                                na_rep / na_frac
              ------------------- ----------------------------------------------------------------------------
               other_cols   sep    None / 0.0   None / 0.001   None / 0.15   - / 0.0    - / 0.001    - / 0.15
              ============ ====== ============ ============== ============= ========== ============ ==========
                   0        None    10.9±0ms      9.38±0ms      10.9±0.8ms   9.38±0ms   9.38±0.6ms   10.9±0ms
                   0         ,     9.38±0.6ms    14.1±0.8ms     14.1±0.6ms   9.38±0ms    9.38±0ms    10.9±0ms
                   3        None    62.5±6ms      46.9±6ms       46.9±0ms    54.7±8ms    46.9±8ms    62.5±0ms
                   3         ,      46.9±6ms      46.9±8ms       46.9±6ms    78.1±0ms    70.3±8ms    93.8±8ms
              ============ ====== ============ ============== ============= ========== ============ ==========


BENCHMARKS NOT SIGNIFICANTLY CHANGED.

Capital letters telling me everything is fine? Not so much unfortunately... (stars added manually)

(pandas-dev) C:\[...]\pddev\asv_bench>asv compare b28cf5aa 0d3c6d21

All benchmarks:

       before           after         ratio
     [b28cf5aa]       [0d3c6d21]
     <master>         <cln_str_cat~2>
         9.38±0ms         9.38±0ms       1.00    strings.Cat.time_cat(0, ',', '-', 0.0)
         9.38±0ms         10.9±0ms    **~1.17**  strings.Cat.time_cat(0, ',', '-', 0.001)
         10.9±0ms         12.5±0ms    **~1.14**  strings.Cat.time_cat(0, ',', '-', 0.15)
       9.38±0.6ms         9.38±0ms       1.00    strings.Cat.time_cat(0, ',', None, 0.0)
       14.1±0.8ms         10.9±0ms    **~0.78**  strings.Cat.time_cat(0, ',', None, 0.001)
       14.1±0.6ms       11.7±0.8ms    **~0.83**  strings.Cat.time_cat(0, ',', None, 0.15)
         9.38±0ms         9.38±0ms       1.00    strings.Cat.time_cat(0, None, '-', 0.0)
       9.38±0.6ms       10.9±0.6ms    **~1.17**  strings.Cat.time_cat(0, None, '-', 0.001)
         10.9±0ms         12.5±0ms    **~1.14**  strings.Cat.time_cat(0, None, '-', 0.15)
         10.9±0ms         7.81±2ms    **~0.71**  strings.Cat.time_cat(0, None, None, 0.0)
         9.38±0ms       10.9±0.8ms    **~1.17**  strings.Cat.time_cat(0, None, None, 0.001)
       10.9±0.8ms       11.7±0.8ms       1.07    strings.Cat.time_cat(0, None, None, 0.15)
         78.1±0ms         78.1±8ms       1.00    strings.Cat.time_cat(3, ',', '-', 0.0)
         70.3±8ms         78.1±0ms    **~1.11**  strings.Cat.time_cat(3, ',', '-', 0.001)
         93.8±8ms         93.8±0ms       1.00    strings.Cat.time_cat(3, ',', '-', 0.15)
         46.9±6ms         78.1±0ms    **~1.67**  strings.Cat.time_cat(3, ',', None, 0.0)
         46.9±8ms         78.1±0ms    **~1.67**  strings.Cat.time_cat(3, ',', None, 0.001)
         46.9±6ms         62.5±8ms    **~1.33**  strings.Cat.time_cat(3, ',', None, 0.15)
         54.7±8ms         54.7±8ms       1.00    strings.Cat.time_cat(3, None, '-', 0.0)
         46.9±8ms         62.5±8ms    **~1.33**  strings.Cat.time_cat(3, None, '-', 0.001)
         62.5±0ms        78.1±10ms    **~1.25**  strings.Cat.time_cat(3, None, '-', 0.15)
         62.5±6ms         46.9±8ms    **~0.75**  strings.Cat.time_cat(3, None, None, 0.0)
         46.9±6ms         62.5±0ms    **~1.33**  strings.Cat.time_cat(3, None, None, 0.001)
         46.9±0ms         62.5±6ms    **~1.33**  strings.Cat.time_cat(3, None, None, 0.15)

It just so happened that all the significant changes were hiding behind a tilde...

PS. I just installed from github a few days ago, I'm on 0.4.dev1533+5f9be296.

Metadata

Metadata

Assignees

No one assigned

    Labels

    needs-infoAdditional information needed to progress.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions