Moving a Fraud-Fighting Random Forest from scikit-learn to Spark with ML, MLflow, and Jupyter
Josh Johnston
Director of AI Science
josh.johnston@kount.com
©Kount Inc All Rights Reserved
Overview
Model lifecycle
Our fraud-detecting model
Initial method with database and scikit-learn
Improved method with HDFS and Spark
Robust model governance
Manage the model lifecycle
Microsoft. (2017, October 19). What is the Team Data Science Process? Retrieved March 26, 2019, from https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/overview
Modeling
• Configuration management
• Performance (speed)
• Accuracy
• Validation
Governance Questions
• Which model are you using?
• How did you train it?
• How well does it work?
After each answer: Why?
Science is repeatable
Our fraud-detecting model
Kount protects digital innovations from…
• Fraudulent Account Creation
• Transaction/Payment Fraud
• Account Takeover Fraud
• Authentication Friction
Evaluate transactions for fraud
• Substantial throughput
• 30-100 transactions per second
• Low latency
• 250 ms end-to-end system latency
• ~15 ms for machine learning features and model
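A quick back-of-the-envelope calculation puts these figures in proportion. The sketch below uses only the numbers on this slide. The concurrency estimate applies Little's law (average work in flight = arrival rate × time in system), which is an added assumption and not something stated in the talk.

```python
# Figures taken from the slide.
PEAK_TPS = 100        # upper end of the 30-100 transactions per second range
END_TO_END_MS = 250   # end-to-end system latency
ML_MS = 15            # approximate time for ML features and model


def ml_share_of_budget(ml_ms: float, total_ms: float) -> float:
    """Fraction of the end-to-end latency spent on ML features and model."""
    return ml_ms / total_ms


def avg_in_flight(tps: float, latency_ms: float) -> float:
    """Little's law: average number of requests in progress at once."""
    return tps * latency_ms / 1000.0


share = ml_share_of_budget(ML_MS, END_TO_END_MS)
ml_concurrency = avg_in_flight(PEAK_TPS, ML_MS)
system_concurrency = avg_in_flight(PEAK_TPS, END_TO_END_MS)

print(f"ML share of latency budget: {share:.0%}")
print(f"Average ML evaluations in flight at peak: {ml_concurrency:.1f}")
print(f"Average transactions in flight at peak: {system_concurrency:.0f}")
```

Under these assumptions the model takes a small slice of the latency budget, and only a handful of evaluations overlap even at peak load.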
Boost Technology™ Customer View
• Approve an extra ~3K transactions and $1.2M USD per month
• Reduced manual reviews by 200 hours/month
• Reduced chargeback rate by 17%
• Reduced manual reviews by 20%
Fraud Manager Feedback:
• Sleep better at night
• Don’t hear complaints from fraud team about review queue anymore
Boost Technology™ Technical View
Feature Engineering
• 200 GB of precomputed data
Model
• Random forest
• 250 trees
• ~100k nodes per tree
• ~1GB serialized representation
Model Training
• ~150 features
• ~60M observations
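The sizes on this slide can be related to one another with simple arithmetic. The sketch below is only an estimate: it treats "~1GB" as 10^9 bytes and assumes the training matrix is held as dense 8-byte floats, neither of which is stated in the talk.

```python
# Figures taken from the slide (all approximate).
TREES = 250
NODES_PER_TREE = 100_000
SERIALIZED_BYTES = 10**9      # "~1GB", assumed to mean decimal gigabytes
FEATURES = 150
OBSERVATIONS = 60_000_000

# Total nodes in the forest and the implied serialized size per node.
total_nodes = TREES * NODES_PER_TREE
bytes_per_node = SERIALIZED_BYTES / total_nodes

# Assumption: a dense training matrix of 8-byte floats.
BYTES_PER_VALUE = 8
matrix_cells = FEATURES * OBSERVATIONS
matrix_gb = matrix_cells * BYTES_PER_VALUE / 10**9

print(f"Total nodes: {total_nodes:,}")
print(f"Serialized bytes per node: {bytes_per_node:.0f}")
print(f"Dense training matrix: {matrix_gb:.0f} GB")
```

Any intermediate copies or preprocessing buffers would add to the matrix figure, so it is a lower bound on training memory under the stated assumption.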
Initial training with database and scikit-learn
First approach gets to production
• Components: Analytics Database, Model Training Service, Network Storage
• Steps: Fetch observations, Fetch lookups, Lookup compute, Train Model (scikit-learn)
• Stored outputs: Observation, Lookup, Flat File Logging, Pickled Model
• Step times shown: 16 hrs, 24 hrs, 8 hrs, 1 hr, 12 hrs
• Total: 2.5 days, 400 GB RAM, 1 TB into swap
What works
• Trains a high value model
What doesn’t work
• Time-intensive
• Errors force restarts since everything is held in memory (and swap)
• Burdens production analytics database
• Pickled model ties execution environment to training environment
• Traceability provided by log files and manual documentation
• Ad hoc experiments with little configuration control
Governance Questions
• Which model are you using?
• How did you train it?
• How well does it work?
After each answer: Why?
Improved training with HDFS and Spark
Cluster for distributed computing
• Dell hardware
• 6 nodes
• 484 vCores
• 1.35 TB RAM
• Cloudera Manager
• Spark 2.4
• Mostly Python
HDFS
• Attached to 3 nodes
• 171 TB usable space
Improved approach through cluster
• Components: Analytics Database, Spark Cluster, HDFS
• Steps: sqoop data, Compute lookups, Perform lookups, Train Model (Spark ML)
• Stored outputs: Observation, Lookup, Logging, Zipped MLeap Model
• Tools: Luigi, MLflow
• Step times shown: 45 min, 2 hrs, 8 hrs
• Total: <1/2 day
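The step times on the two pipeline slides can be added up to check the stated totals and estimate the overall speedup. This sketch assumes the steps run one after another. The slides do not say so explicitly, but the sums agree with the totals they give (2.5 days and under half a day).

```python
from datetime import timedelta

# Step times as listed on the first-approach slide (hours).
initial_steps = [timedelta(hours=h) for h in (16, 24, 8, 1, 12)]

# Step times as listed on the improved-approach slide.
improved_steps = [
    timedelta(minutes=45),
    timedelta(hours=2),
    timedelta(hours=8),
]

# Assumption: steps are sequential, so the total is the plain sum.
initial_total = sum(initial_steps, timedelta())
improved_total = sum(improved_steps, timedelta())

# Dividing two timedeltas yields a plain float ratio.
speedup = initial_total / improved_total

print(f"Initial pipeline:  {initial_total.total_seconds() / 3600:.2f} hours")
print(f"Improved pipeline: {improved_total.total_seconds() / 3600:.2f} hours")
print(f"Speedup: {speedup:.1f}x")
```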
Remote development with Jupyter
• Most criticisms of notebooks are things you COULD do, not what you MUST do
• Good development practices are independent of tools
[Diagram: a maturity axis running from Research to Production, with Jupyter Notebook, PySpark Application, and Python Packages placed along it. Also shown: Version Control (git) and Automation.]
What works
• Faster
• Failures restart in the middle
• Reduces burden on production analytics database
• Redesign experiments without penalty
• MLeap decouples evaluation environment from training environment
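The "failures restart in the middle" property comes from persisting each stage's output and skipping stages whose output already exists. The following is a minimal standard-library sketch of that idea, not the Luigi API and not Kount's pipeline: the stage names, the toy data, and the `run_stage` and `run_pipeline` helpers are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def run_stage(name, func, workdir, log):
    """Run func once; on later calls reuse the output file it left behind."""
    target = Path(workdir) / f"{name}.json"
    if target.exists():
        log.append(f"skip {name}")
        return json.loads(target.read_text())
    result = func()
    # Write to a temp file, then rename, so a crash never leaves a partial output.
    tmp = target.with_suffix(".tmp")
    tmp.write_text(json.dumps(result))
    tmp.replace(target)
    log.append(f"run {name}")
    return result


def run_pipeline(workdir, log, fail_in_training=False):
    """Three toy stages standing in for fetch, lookup, and train."""
    observations = run_stage("observations", lambda: [1, 2, 3, 4], workdir, log)
    lookups = run_stage(
        "lookups", lambda: [x * 10 for x in observations], workdir, log
    )

    def train():
        if fail_in_training:
            raise RuntimeError("simulated training failure")
        return {"mean_lookup": sum(lookups) / len(lookups)}

    return run_stage("model", train, workdir, log)


with tempfile.TemporaryDirectory() as workdir:
    log = []
    try:
        run_pipeline(workdir, log, fail_in_training=True)
    except RuntimeError:
        pass  # the first attempt dies during training
    first_log = list(log)

    log.clear()
    model = run_pipeline(workdir, log)  # the retry reuses the finished stages
    second_log = list(log)

print(first_log)
print(second_log)
print(model)
```

On the retry only the training stage runs, because the earlier stages find their outputs on disk. In the held-in-memory approach, the same failure would have meant starting again from the first fetch.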
What still doesn’t work
• Non-deterministic Spark ML behavior and errors
• Spark pipelines rely on configurations that change based on input data
Tools and Processes for Model Governance
Tools and processes for governance
Governance Questions
• Which model are you using?
• How did you train it?
• How well does it work?
After each answer: Why?
Solution components
• Data traceability
• Experiment, configuration, and accuracy traceability
• Data pipelines with error handling
• Repeatable and documented data transformations
• Document parameters
• Trace to code and data used
• Record accuracy of selected and not selected models
• Store final model and configurations as artifact
Governance Questions
• Which model are you using?
• How did you train it?
• How well does it work?
After each answer: Why?
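The practices above amount to writing one record per training run that ties together code, data, parameters, and accuracy. The talk names MLflow for this. The sketch below deliberately avoids any tracking library and uses only the standard library to show what such a record might contain. The `write_run_record` helper, the field names, the commit string, the parameter values, and the scores are hypothetical placeholders, not results from the talk.

```python
import hashlib
import json
import platform
import tempfile
from pathlib import Path


def sha256_of(path):
    """Fingerprint a data file so a run can be traced to its exact input."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_run_record(run_dir, data_path, code_version, params, metrics, selected):
    """Store parameters, metrics, and provenance for one run as a JSON artifact."""
    record = {
        "code_version": code_version,          # e.g. a git commit hash
        "data_sha256": sha256_of(data_path),
        "python_version": platform.python_version(),
        "params": params,
        "metrics": metrics,
        "selected": selected,                  # kept for rejected models too
    }
    run_dir = Path(run_dir)
    run_dir.mkdir(parents=True, exist_ok=True)
    out = run_dir / "run_record.json"
    out.write_text(json.dumps(record, indent=2, sort_keys=True))
    return out


with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "observations.csv"
    data.write_text("amount,label\n10,0\n999,1\n")  # toy data for illustration

    # Hypothetical candidate configurations and made-up placeholder scores.
    candidates = [
        {"numTrees": 100, "maxDepth": 10},
        {"numTrees": 250, "maxDepth": 20},
    ]
    scores = [0.90, 0.93]
    best = max(range(len(candidates)), key=lambda i: scores[i])

    records = []
    for i, (params, score) in enumerate(zip(candidates, scores)):
        path = write_run_record(
            Path(tmp) / f"run_{i}", data, "abc1234", params,
            {"auc": score}, selected=(i == best),
        )
        records.append(json.loads(path.read_text()))

print(json.dumps(records[best], indent=2))
```

Recording the rejected run alongside the selected one is what lets the "Why?" after each governance question be answered later.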
Conclusions
Kount’s benefits from Spark/HDFS, Luigi, and MLflow
• Faster
• Failures can restart in the middle
• Reduce burden on production analytics database
• Redesign experiments without penalty
• MLeap decouples evaluation environment from training environment
Governance Questions
• Which model are you using?
• How did you train it?
• How well does it work?
After each answer: Why?