Technology
An Agile Approach to Machine Learning
Randy Shoup
VP Engineering
Background
@randyshoup
1. The Problem
What problem are
you trying to solve?
Agree on what you
are optimizing
• aka “Optimization Function” or “One
Metric That Matters”
• Discussing and agreeing on this metric
is itself valuable
• Only very few metrics, preferably one
Overall Evaluation
Criterion (OEC)
• E.g., Actions vs. click rate
• E.g., Long-term customer value vs.
short-term revenue
• “Pirate metrics” (AARRR): Acquisition,
Activation, Retention, Revenue,
Referral
Aligned to Business
Value
• Validated by data science, not solely
chosen by product / business
• Look for predictive leading indicators
• Avoid lagging indicators and vanity
metrics
Valid and
Measurable
Evaluating Success
Problem
“A problem
well-stated
is a problem
half-solved.”
-- Charles Kettering,
head of research at GM
Problem Difficulty
Problem
https://xkcd.com/1425/
2. The Data
• Many events, only predictive in
aggregate
• E.g., web search queries, ecommerce
clickstream, Netflix viewing metrics
Big but Shallow
• Few events, each of which is significant
• E.g., ecommerce purchases, WeWork
event attendance
Small but Deep
Characterizing Your Data
Data
Better data beats a
smarter algorithm
• Missing data, partial data
• Improperly or inconsistently formatted
Clean Data
• Consolidated into a single (logical)
location so it can be processed or
analyzed
• Joined together (“enriched”) with other
data sources
Aggregated Data
• Tagged by humans with one or more
labels
• Required to train supervised models
• Complicated and expensive at scale
Labeled Data
Better Data
Data
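A minimal sketch of the "Clean Data" step, assuming a small list of hypothetical purchase events. The field names, the two date formats, and the reject-on-missing policy are illustrative assumptions, not part of the talk.

```python
from datetime import datetime
from typing import Optional

# Hypothetical raw events showing the defects named above:
# a missing value, a partial record, and inconsistent date formats.
RAW_EVENTS = [
    {"user_id": "u1", "amount": "19.99", "date": "2018-03-01"},
    {"user_id": "u2", "amount": "", "date": "03/02/2018"},
    {"user_id": None, "amount": "5.00", "date": "2018-03-03"},
    {"user_id": "u3", "amount": "7.50", "date": "03/04/2018"},
]

# Assumed set of formats seen in the raw data.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_date(text: str) -> Optional[str]:
    """Return the date in ISO format, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean(events):
    """Split events into normalized rows and rejected rows."""
    kept, rejected = [], []
    for event in events:
        date = parse_date(event.get("date") or "")
        amount = event.get("amount")
        if not event.get("user_id") or not amount or date is None:
            rejected.append(event)  # keep rejects so data loss stays visible
            continue
        kept.append(
            {"user_id": event["user_id"], "amount": float(amount), "date": date}
        )
    return kept, rejected


kept, rejected = clean(RAW_EVENTS)
print(len(kept), "kept,", len(rejected), "rejected")
```

Returning the rejected rows instead of silently dropping them is one possible design choice; it makes the amount of missing or partial data measurable.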
• More potentially useful attributes
• More data sources
• Longer retention
More Data
• Data pipeline to automate collection and
aggregation
• Move from large batch to mini-batch to
streaming data
Timely Data
Better Data
Data
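One way to picture the move from batch to streaming: the same aggregate can be recomputed over the whole data set, or updated one event at a time. The sketch below is an assumption-laden illustration (the `RunningMean` class and the sample values are hypothetical), not a description of any particular pipeline.

```python
class RunningMean:
    """Incrementally updated mean, so no event needs to be stored."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value: float) -> float:
        # Standard incremental mean: shift the mean toward the new value.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


values = [3.0, 5.0, 10.0]  # hypothetical event values

# Batch: wait for all the data, then compute once.
batch_mean = sum(values) / len(values)

# Streaming: the aggregate is current after every event.
stream = RunningMean()
for v in values:
    stream.update(v)

print(batch_mean, stream.mean)
```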
“Data preparation accounts
for about 80% of the work of
data scientists.” – CrowdFlower survey,
2016
https://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says/#2d58f4ab6f63
3. The Algorithms
• Encode expert knowledge
• Simple set of imperative if-then-else
statements
• Brittle and primitive
• Surprisingly effective
Rules and Heuristics
• Regression
• Decision trees / forests
• Collaborative filtering
• May be all you need
Simple Algorithms
• Iterative Optimization / Dynamic
Programming
• Neural nets
• Deep learning
• Only when absolutely required
Advanced Techniques
Algorithmic Evolution
Algorithms
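To make the first stage of this evolution concrete, here is a sketch of a rules-and-heuristics baseline written as plain if-then-else statements. The event recommendation scenario, the field names, and the threshold of 50 attendees are all hypothetical, chosen only to show the shape of such rules.

```python
def recommend_event(member: dict, event: dict) -> bool:
    """Hand-written rules encoding a (hypothetical) expert's judgment."""
    if event["city"] != member["city"]:
        return False  # never recommend events the member cannot reach
    if event["topic"] in member["interests"]:
        return True  # direct interest match
    if event["attendees"] >= 50:
        return True  # assumed rule: popular local events are shown to everyone
    return False


member = {"city": "New York", "interests": {"python", "design"}}
events = [
    {"name": "Python meetup", "city": "New York", "topic": "python", "attendees": 12},
    {"name": "Wine tasting", "city": "New York", "topic": "wine", "attendees": 80},
    {"name": "Design talk", "city": "Seattle", "topic": "design", "attendees": 30},
]

recommended = [e["name"] for e in events if recommend_event(member, e)]
print(recommended)
```

A baseline like this is brittle, but it gives later models something measurable to beat in an A | B test.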
• Many real-world problems are best
solved through a combination of several
algorithms
• E.g., Netflix Prize
Portfolio / Ensemble
Approaches
Algorithmic Evolution
Algorithms
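A minimal sketch of an ensemble, assuming three toy scoring functions combined by a weighted average. The models, feature names, and weights are hypothetical; real ensembles (such as the Netflix Prize entries) blend far richer models, and the weights are usually learned rather than hand-set.

```python
def rule_model(x: dict) -> float:
    """Rule: anyone with a past purchase is a likely buyer."""
    return 1.0 if x["past_purchases"] > 0 else 0.0


def popularity_model(x: dict) -> float:
    """Score grows with item views, capped at 1.0."""
    return min(x["item_views"] / 1000.0, 1.0)


def recency_model(x: dict) -> float:
    """Score decays with days since the last visit."""
    return 1.0 / (1.0 + x["days_since_visit"])


def blend(models_and_weights, x: dict) -> float:
    """Weighted average of the individual model scores."""
    total_weight = sum(weight for _, weight in models_and_weights)
    weighted_sum = sum(weight * model(x) for model, weight in models_and_weights)
    return weighted_sum / total_weight


# Hypothetical hand-set weights.
ENSEMBLE = [(rule_model, 0.5), (popularity_model, 0.3), (recency_model, 0.2)]

example = {"past_purchases": 2, "item_views": 500, "days_since_visit": 1}
score = blend(ENSEMBLE, example)
print(round(score, 3))
```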
Online Model Execution
Algorithms
[Diagram: Model Execution, Usage, Collect Data, Train Model, Deploy Model]
Offline Model Building
Algorithms
[Diagram: Model Execution, Model Building, Try New Model]
• Many common algorithms are highly
accurate, but difficult to interpret
• Model can make a decision, but we
cannot “explain” its decision
• Particularly important in context of
system bias
• (+) Decision trees / forests, linear
regression
• (-) Neural nets, Deep Learning
Interpretability /
Explainability
• Enable data scientists to be self-sufficient
in experimenting, building,
training, and deploying
• End-to-end responsibility for models in
production
• Write models, deploy models, monitor
model performance
DevOps for
Data Science
• Platform-as-a-service for data scientists
• Programming model that matches the
workflow of a data scientist
• Abstract away infrastructure and other
details
Algorithm
Platform
Scaling Algorithm Development
Algorithms
• Data scientists spin up their own resources
• Both ad-hoc execution and repeatable pipelines
• Data science-friendly programming model exposes ETL and
Matrix transforms
• Abstracts away storage (S3), computation (Docker and ECS), and
the model building pipeline (Spark)
Algorithm Platform-as-a-Service
Algorithms
4. The Experiments
“It doesn’t matter how
beautiful your theory is.
It doesn’t matter how
smart you are.
If it doesn’t agree with
experiment, it’s wrong.”
-- Richard Feynman
• What metrics do you expect to move,
and why
• Understand your baseline
1. State Your
Hypothesis
• Sample size based on effect size
• Separate control and treatment groups,
test for bias
• Split traffic between control and
treatment
2. Design a Real A|B
Test
• Understand customer and system
behavior
• Understand why this experiment worked
or did not
3. Obsessively Log and
Measure
Designing and Running
Experimental Discipline
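"Sample size based on effect size" can be sketched with the standard normal-approximation formula for comparing two conversion rates. The function name, the default significance level of 0.05, and the default power of 0.8 are assumptions for illustration; the talk does not prescribe a specific formula.

```python
import math
from statistics import NormalDist


def sample_size_per_group(
    baseline_rate: float,
    expected_rate: float,
    alpha: float = 0.05,
    power: float = 0.8,
) -> int:
    """Approximate users needed in EACH group for a two-sided test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)  # desired power
    variance = baseline_rate * (1 - baseline_rate) + expected_rate * (
        1 - expected_rate
    )
    effect = expected_rate - baseline_rate
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect**2)


# Hypothetical baseline of 10% conversion; hypothesis: treatment reaches 12%.
n_large_effect = sample_size_per_group(0.10, 0.12)
# A smaller expected effect needs many more users to detect.
n_small_effect = sample_size_per_group(0.10, 0.11)
print(n_large_effect, n_small_effect)
```

The sample size grows roughly with the inverse square of the effect size, which is why understanding the baseline and stating the expected movement up front matters.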
• Data trumps hope and intuition
• Develop insights for the next experiment
4. Listen to the
Data
• This is a journey, not a single step
5. Rinse and Repeat
Designing and Running
Experimental Discipline
Listen to the Data
Experimental Discipline
• 1/3 of ideas were positive and
statistically significant
• 1/3 of ideas were flat: no
statistically significant difference
• 1/3 of ideas were negative and
statistically significant
https://exp-platform.com/experiments-at-microsoft/
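Sorting an experiment into "positive", "flat", or "negative" requires a significance test. The sketch below uses a pooled two-proportion z-test, one common choice; the function names, the 0.05 threshold, and the conversion counts are hypothetical and are not taken from the Microsoft study.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for treatment B versus control A."""
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def classify(conv_a: int, n_a: int, conv_b: int, n_b: int, alpha: float = 0.05) -> str:
    """Label an experiment as positive, flat, or negative."""
    z, p_value = two_proportion_z_test(conv_a, n_a, conv_b, n_b)
    if p_value >= alpha:
        return "flat"  # no statistically significant difference
    return "positive" if z > 0 else "negative"


# Hypothetical experiments: (control conversions, control users,
#                            treatment conversions, treatment users)
print(classify(500, 5000, 600, 5000))
print(classify(500, 5000, 505, 5000))
print(classify(600, 5000, 500, 5000))
```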
“Being wrong isn’t a bad
thing, like they teach
you in school. It is an
opportunity to learn
something.”
-- Richard Feynman
• Low-risk, push-button deployment
• Rapid release cadence
• Rapid rollback and recovery
Repeatable Deployment
Pipeline
• Faster to repair
• Easier to understand
• Simpler to diagnose
Smaller Units of Work
• Changes can be rolled out and rolled
back
• Learnings can be applied in the next
experiment
Enables
Experimentation
Continuous Delivery
Experimental Discipline
• Flag controls whether feature is “on” for
a particular set of users
• Independently discovered at eBay,
Yahoo, Google
• Decouple feature delivery from code
delivery
Enable / Disable feature
via configuration
• Develop / test / verify in production
• Rapid on or off for any reason
Makes Speed Safe
• Overall experiment controlled by feature
flag
• Control vs. treatment
Enables
Experimentation
Feature Flags
Experimental Discipline
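A feature flag that also drives an experiment can be sketched as a configuration lookup plus deterministic bucketing of users. Everything here is an illustrative assumption: the `FLAGS` dictionary stands in for a real configuration service, and the flag name, percentage, and hashing scheme are hypothetical.

```python
import hashlib

# Hypothetical configuration; in practice this would live outside the code
# so the feature can be turned on or off without a deployment.
FLAGS = {
    "new_ranking": {"enabled": True, "treatment_percent": 10},
}


def bucket(user_id: str, flag_name: str) -> int:
    """Map a user to a stable bucket in [0, 100) for this flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def variant(user_id: str, flag_name: str, flags: dict = FLAGS) -> str:
    """Return 'treatment' if the feature is on for this user, else 'control'."""
    config = flags.get(flag_name)
    if config is None or not config["enabled"]:
        return "control"  # flag off (or unknown): everyone gets existing behavior
    if bucket(user_id, flag_name) < config["treatment_percent"]:
        return "treatment"
    return "control"


for user in ("alice", "bob", "carol"):
    print(user, variant(user, "new_ranking"))
```

Hashing the flag name together with the user ID keeps each user's assignment stable across requests while keeping assignments independent between experiments. Setting `enabled` to false is the "rapid off" switch.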
● Ranking function for search results
○ Small number of hand-tuned factors → Thousands of factors
● Incremental Experimentation
○ Predictive models: query->view, view->purchase, etc.
○ Hundreds of parallel A | B tests
○ Full year of steady, incremental improvements
→ 2% increase in eBay revenue (~$120M / year)
Machine-Learned Ranking
● Reduce user-experienced latency for search results
● Iterative Process
○ Implement a potential improvement
○ Release to the site in an A | B test
○ Monitor metrics: time to first byte, time to click, click rate, purchase rate
→ 2% increase in eBay revenue (~$120M / year)
Site Speed
The most
dangerous
animal is the
“HiPPO” (Highest Paid Person’s Opinion)
Putting it All Together
Event Recommendations
WeWork Member Experience
Member Knowledge
Graph
Skills and
Interests
Event Feedback
Event Recommender
Predictive
Model
Event Recipes
WeWork Member Experience
Event Recommender
Predictive
Model
Get the predicted
opening occupancy
based on the
recommended 1-Click
price
Adjust the price to see how
occupancy will change
Occupancy Predictor
WeWork Revenue Optimization
Revenue Simulation
WeWork Revenue Optimization
Office Attributes Based Pricing
Corner office (premium)
Offices with high quality
views (premium)
Calculate and recommend
premium and discounts for
key office attributes
WeWork Revenue Optimization
Example: Recommend alternative usage for unoccupied spaces
Fully optimize inventory usage by
leveraging demand and
profitability predictions
Inventory Management
WeWork Revenue Optimization
Automatically lay out desk
configuration given space
constraints
Automated Layout
WeWork Applied Science
Takeaways
• Identify and frame a clear business
problem
• … that matters to customers or the
business
• Define clear metric(s) for success
1. Drive from Business
Needs
• Single problem
• Solve problem end-to-end
• Show business results
2. Start Small
• Data collection and storage
• Data cleanliness and preparation
• Reliable, accurate, timely data pipeline
• Better data beats a better model (!)
3. Data Matters
Takeaways
An Agile Approach to Machine Learning
• Start with a Hypothesis
• Design an Experiment
• Separate Control and Experiment
group(s)
• Measure business metric for A vs. B
• Learn and Decide
4. A | B Testing
Discipline
• Simple model / No model
• Rules and Heuristics
• Gradually increase sophistication with
more data and more experience
5. Iteratively Refine
Model
• Find broader applicability across the
business
• Apply to more and more problems
• Move “upstream” in the development
process
6. Iteratively Expand
Applications
Takeaways
An Agile Approach to Machine Learning
• Make decisions with data instead of
guesswork and intuition
• Avoid HiPPO decision-making
• Can be threatening to designers,
product managers, decision-makers
7. Data-Driven Culture
• Set of tools in our toolbox
• Sometimes valuable and useful
• Not a panacea
• Not a substitute for thinking
8. Machine Learning is
not Magic
Takeaways
An Agile Approach to Machine Learning
New York
San Francisco
Tel Aviv
Shanghai
Singapore
Seattle
Palo Alto
Questions?
@randyshoup
