Reproducibility and experiments management in Machine Learning

Simple. Precise.
Competent.
Reproducibility and experiments management in ML projects
D1 EXPO 2019, Alicante
Rozhkov Mikhail
Senior Data Scientist
Raiffeisen Bank Russia

Topics
• Difference of ML projects from IT projects
• ML experiments management
• Agile ML
• ML reproducibility and why it’s important?
• Approaches and tools

ML projects
1. Different from IT projects
2. Longer dev cycle
3. Experiments driven
4. Not easy to test and validate

IT projects development processes
Source: https://online.husson.edu/software-development-cycle/
Source: https://en.wikipedia.org/wiki/Software_development_process

ML project workflow is experiments driven
[Diagram: workflow of an ML project]
Phases: 1. Analyze & Plan, 2. Prototype, 3. Productionize, 4. Monitor & Maintain
Steps: Problem Statement, MVP design, Solution development, Get data, Prepare data, Train model, Evaluate model, Test & Integrate, Serve / Predict, Monitor
Inspired by Uber’s workflow of a machine learning project diagram. Scaling Machine Learning at Uber with Michelangelo https://eng.uber.com/scaling-michelangelo/

Experiment = code + dataset + outputs
[Diagram elements: Data, ETL tasks, train dataset, test dataset, Algorithm, Hyperparameters, train, Model, evaluate, Evaluation Measure, config]
Experiment
- artifacts
- pipelines
- code
- configs
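The idea that an experiment is code + dataset + outputs can be sketched as a small record that fingerprints its inputs. This is only an illustration under stated assumptions: `ExperimentRecord`, `fingerprint`, and all values below are hypothetical and do not come from the talk or from any specific tool.

```python
import hashlib
import json
from dataclasses import dataclass, field


def fingerprint(payload: bytes) -> str:
    """Return a short, stable content hash for a blob of bytes."""
    return hashlib.sha256(payload).hexdigest()[:12]


@dataclass(frozen=True)
class ExperimentRecord:
    # Hypothetical structure: inputs (code, data, config) plus outputs (metrics).
    code_version: str                             # e.g. a Git commit hash
    dataset_hash: str                             # content hash of the training data
    config: dict = field(default_factory=dict)    # hyperparameters, seeds
    metrics: dict = field(default_factory=dict)   # evaluation outputs

    def experiment_id(self) -> str:
        """Identify the experiment by its inputs only, not by its outputs."""
        inputs = {
            "code_version": self.code_version,
            "dataset_hash": self.dataset_hash,
            "config": self.config,
        }
        # sort_keys makes the serialization independent of dict ordering.
        return fingerprint(json.dumps(inputs, sort_keys=True).encode("utf-8"))


# Made-up data and metric values, for illustration only.
data = b"uid,feature,label\n1,0.5,1\n2,0.1,0\n"
run_a = ExperimentRecord("abc123", fingerprint(data), {"lr": 0.1, "seed": 42}, {"auc": 0.81})
run_b = ExperimentRecord("abc123", fingerprint(data), {"seed": 42, "lr": 0.1}, {"auc": 0.81})
run_c = ExperimentRecord("abc123", fingerprint(data), {"lr": 0.2, "seed": 42})

print(run_a.experiment_id() == run_b.experiment_id())  # same inputs, same id
print(run_a.experiment_id() == run_c.experiment_id())  # changed hyperparameter, new id
```

Keying the id on inputs only means that two runs with the same code, data, and config are expected to produce the same outputs, which is the property a reproducibility check would verify.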

ML projects require more factors to be taken into account
(Software baseline first; items marked "+" are what ML adds)
Architecture design: tasks, UI/UX, integrations
+ nature and quality of data
Quality measures: working code
+ model quality metrics
+ performance in production
Version control: code, environment
+ pipelines
+ datasets
+ models & artifacts
Testing: code
+ data and features
+ model development methods
+ ML infrastructure
+ ML systems
Inspired by Dmitry Petrov, Ivan Shcheklein. Open source tools for machine learning model and dataset versioning.

Agile ML
1. Satisfy customer
2. Fail early
3. Fail safe
4. Frequent code updates
5. Constantly changing requirements
6. Team = Business + DS/ML
7. Frequent team meetings/statuses
8. Measure of progress = working code
9. Technical excellence and good design
10. Reproducibility
Inspired by: Andrew Kelleher, Adam Kelleher. Machine Learning in Production: Developing and Optimizing Data Science Workflows and Applications. 2019

ML reproducibility is a dimension of quality
What is Reproducibility?
● Using the original methods applied to the original data to produce the original results [Gardner]
Why should you care?
● Trust
● Consistent Results
● Versioned History
● Team Performance
● Painless Production
Josh Gardner, Yuming Yang, Ryan S. Baker, Christopher Brooks. Enabling End-To-End Machine Learning Replicability: A Case Study in Educational Data Mining

Reproducible ML
● Code, models and data version control
● Automated pipelines
● Tests
● Control environment
● Experiments management
● Methodology and procedures documentation

Onsite Recommendation System
Purpose: improve conversion rate on landing page
[Diagram: the site sends online user data and gets a promo. Components: User History Data, CV prediction model, Promo recommendation model, Promo DB. Record: {uid, cv_pred, promo_id}]

Tracking project statuses and issues, documentation

Code
● Version control
● Re-usable .py modules
● Tests...
Source: https://www.bitbull.it/en/blog/how-git-flow-works/

Data and artifacts
● Version Control
● Store / share
● Access
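One way to picture version control, storage, and access for data is a content-addressed store, where each dataset version is saved under the hash of its bytes. The sketch below is a minimal illustration using only the Python standard library; `store_version` and `fetch_version` are hypothetical helpers, not the API of any versioning tool.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_hash(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def store_version(path: Path, store: Path) -> str:
    """Copy the file into the store under its content hash and return the hash."""
    version = file_hash(path)
    store.mkdir(parents=True, exist_ok=True)
    target = store / version
    if not target.exists():  # identical content is stored only once
        shutil.copyfile(path, target)
    return version


def fetch_version(version: str, store: Path, destination: Path) -> Path:
    """Restore a specific version of the data by its hash."""
    shutil.copyfile(store / version, destination)
    return destination


workdir = Path(tempfile.mkdtemp())
dataset = workdir / "train.csv"
store = workdir / "store"

dataset.write_text("uid,label\n1,0\n")
v1 = store_version(dataset, store)

dataset.write_text("uid,label\n1,0\n2,1\n")  # the dataset changes
v2 = store_version(dataset, store)

# The first version can still be accessed after the file was overwritten.
restored = fetch_version(v1, store, workdir / "train_v1.csv")
```

Only the short hash needs to be committed alongside the code, so the Git history stays small while each commit still points at the exact data it used.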

Pipelines
● One button run
● End-to-end or selected steps
● Configs (e.g. random seeds)
[Diagram, pipeline steps: load/transform raw data, split train/test, prepare train dataset, prepare test dataset, train, select best model, predict, evaluate]
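The three properties above (one-button run, end-to-end or selected steps, seeds kept in config) can be sketched with plain functions. This is a toy illustration, not the pipeline from the talk: the step names, the `CONFIG` keys, and the mean-value "model" are all assumptions made for the example.

```python
import random
from typing import Callable, Dict, List, Optional

# Hypothetical config; the seed lives here so that runs can be repeated.
CONFIG = {"seed": 42, "n_rows": 100, "test_fraction": 0.2}

Step = Callable[[dict, dict], None]


def load(state: dict, config: dict) -> None:
    # Stand-in for loading raw data: seeded synthetic values.
    rng = random.Random(config["seed"])
    state["rows"] = [rng.random() for _ in range(config["n_rows"])]


def split(state: dict, config: dict) -> None:
    rng = random.Random(config["seed"])
    rows = list(state["rows"])
    rng.shuffle(rows)
    n_test = int(len(rows) * config["test_fraction"])
    state["test"], state["train"] = rows[:n_test], rows[n_test:]


def train(state: dict, config: dict) -> None:
    # Stand-in "model": the mean of the training data.
    state["model"] = sum(state["train"]) / len(state["train"])


def evaluate(state: dict, config: dict) -> None:
    errors = [abs(x - state["model"]) for x in state["test"]]
    state["mae"] = sum(errors) / len(errors)


PIPELINE: Dict[str, Step] = {"load": load, "split": split, "train": train, "evaluate": evaluate}


def run(config: dict, steps: Optional[List[str]] = None, state: Optional[dict] = None) -> dict:
    """Run the whole pipeline, or only the named steps, in pipeline order."""
    state = {} if state is None else state
    for name, step in PIPELINE.items():
        if steps is None or name in steps:
            step(state, config)
    return state


first = run(CONFIG)   # "one button" end-to-end run
second = run(CONFIG)  # same config and seed, same result
# Re-run only the last two steps on top of an existing state.
partial = run(CONFIG, steps=["train", "evaluate"], state=dict(first))
print(first["mae"] == second["mae"] == partial["mae"])
```

Because every source of randomness is seeded from the config, repeating a run with the same config gives the same metric, which is what makes selected-step reruns safe to compare.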

Experiments Management
● Browse history
● Compare results
● Share results
● Methodology and procedures
[Diagram: experiment X (dd.mm.2018) and experiment Y (dd.mm.2019), each consisting of Data, Model, Metrics, Config, and pipelines]
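Comparing two experiments mostly comes down to seeing which config entries changed and how the metrics moved. A minimal sketch follows; the record layout, parameter names, and metric values are made up for illustration and are not results from the talk.

```python
from typing import Any, Dict, Tuple


def diff(left: Dict[str, Any], right: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return keys whose values differ, with (left, right) value pairs."""
    keys = sorted(set(left) | set(right))
    return {k: (left.get(k), right.get(k)) for k in keys if left.get(k) != right.get(k)}


# Hypothetical experiment records with made-up values.
experiment_x = {
    "config": {"model": "logreg", "seed": 42, "C": 1.0},
    "metrics": {"auc": 0.78, "logloss": 0.52},
}
experiment_y = {
    "config": {"model": "logreg", "seed": 42, "C": 0.1},
    "metrics": {"auc": 0.80, "logloss": 0.50},
}

config_changes = diff(experiment_x["config"], experiment_y["config"])
metric_changes = diff(experiment_x["metrics"], experiment_y["metrics"])

for name, (old, new) in config_changes.items():
    print(f"config {name}: {old} -> {new}")
for name, (old, new) in metric_changes.items():
    print(f"metric {name}: {old} -> {new}")
```

Showing only the differences keeps the comparison readable when configs grow large, and it makes it obvious when a metric change cannot be attributed to a single config change.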

Environment
● Libraries
● OS
● Hardware
Example: Effects of data processing conditions on the voxel volumes for a subsample of (sub)cortical structures
Gronenschild EHBM, Habets P, Jacobs HIL, Mengelers R, Rozendaal N, et al. (2012) The Effects of FreeSurfer Version, Workstation Type, and Macintosh Operating System Version on Anatomical Volume and Cortical Thickness Measurements. PLOS ONE 7(6): e38234. https://doi.org/10.1371/journal.pone.0038234
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0038234
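Since libraries, OS, and hardware can all affect results, it may help to record them next to each experiment. The sketch below uses only the Python standard library; `capture_environment` and the package names passed to it are hypothetical choices for this example.

```python
import json
import platform
from importlib import metadata
from typing import Dict, List, Optional


def capture_environment(packages: List[str]) -> dict:
    """Collect interpreter, OS, hardware, and selected library versions."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # record that the package is absent
    return {
        "python": platform.python_version(),
        "os": platform.platform(),
        "machine": platform.machine(),  # CPU architecture, e.g. x86_64
        "packages": versions,
    }


# The package names are placeholders; list whatever the project depends on.
snapshot = capture_environment(["numpy", "surely-not-installed-xyz"])
print(json.dumps(snapshot, indent=2))
```

A snapshot like this records the environment but does not recreate it; pinned dependency files or container images are the usual way to restore it.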

Test ML
● Tests for data and features
● Tests for model development
● Tests for ML infrastructure
● Tests for running ML systems
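Tests for data and features can start very small. The sketch below checks required fields, value ranges, and train/test leakage on rows shaped like the {uid, cv_pred, ...} record from the recommendation example. The helper names and the assumption that `cv_pred` lies in [0, 1] are illustrative, not taken from the talk.

```python
from typing import Dict, List, Set, Tuple

Row = Dict[str, float]


def validate_rows(rows: List[Row], required: List[str],
                  ranges: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return human-readable problems; an empty list means the data passed."""
    problems = []
    for index, row in enumerate(rows):
        for column in required:
            if row.get(column) is None:
                problems.append(f"row {index}: missing {column}")
        for column, (low, high) in ranges.items():
            value = row.get(column)
            if value is not None and not (low <= value <= high):
                problems.append(f"row {index}: {column}={value} outside [{low}, {high}]")
    return problems


def overlapping_ids(train: List[Row], test: List[Row], key: str = "uid") -> Set[float]:
    """Ids present in both splits, which would indicate train/test leakage."""
    return {row[key] for row in train} & {row[key] for row in test}


# Made-up rows; cv_pred is assumed to be a probability in [0, 1].
train_rows = [{"uid": 1, "cv_pred": 0.2}, {"uid": 2, "cv_pred": 0.9}]
test_rows = [{"uid": 3, "cv_pred": 1.7}, {"uid": 4}]

required = ["uid", "cv_pred"]
ranges = {"cv_pred": (0.0, 1.0)}

train_problems = validate_rows(train_rows, required, ranges)
test_problems = validate_rows(test_rows, required, ranges)
leaked = overlapping_ids(train_rows, test_rows)

for problem in test_problems:
    print(problem)
```

Checks like these can run as the first pipeline step, so that bad data stops a run before any model is trained on it.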

Conclusions
1. ML projects require a different approach
2. Data and experiments are crucial
3. Agility is driven by fast experimenting and reproducibility
4. Experiments are versioned, browsable, comparable, documented, reproducible
5. Reproducibility is a dimension of quality and maturity

raiffeisen.ru
Thank you
Rozhkov Mikhail
Senior Data Scientist
Raiffeisen Bank Russia
mikhail.rozhkov@raiffeisen.ru
