Creating Your First
Predictive Model In Python
Coming November 2015!
Information Everywhere
Makes Panda sad and confused
Each New Thing You Learn
Leads to another new thing to learn, and another…
So Many Things
1. Which predictive modeling technique to use
2. How to get the data into a format for modeling
3. How to ensure the “right” data is being used
4. How to feed the data into the model
5. How to validate the model results
6. How to save the model to use in production
7. How to implement the model in production and apply it to new observations
8. How to save the new predictions
9. How to ensure, over time, that the model is correctly predicting outcomes
10. How to later update the model with new training data
Choose Your Model
http://scikit-learn.org/stable/tutorial/machine_learning_map/
Format The Data
• Pandas FTW!
• Use the map() function to convert any text to a number
• Fill in any missing values
• Split the data into features (the data) and targets (the outcome to predict) using .values on the DataFrame
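The formatting steps above can be sketched as follows. This is a minimal illustration, not the deck's actual data: the DataFrame, its column names, and the fill strategies (most common value, median) are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "plan": ["basic", "pro", "basic", None],
    "monthly_spend": [10.0, None, 12.5, 40.0],
    "churned": ["no", "yes", "no", "yes"],
})

# Convert text to numbers with Series.map() and an explicit lookup table.
df["plan"] = df["plan"].map({"basic": 0, "pro": 1})
df["churned"] = df["churned"].map({"no": 0, "yes": 1})

# Fill in missing values (assumed strategy: most common plan, median spend).
df["plan"] = df["plan"].fillna(df["plan"].mode()[0])
df["monthly_spend"] = df["monthly_spend"].fillna(df["monthly_spend"].median())

# Split into features and targets as NumPy arrays using .values.
feature_columns = ["plan", "monthly_spend"]
the_data = df[feature_columns].values
the_targets = df["churned"].values
```

Keeping the lookup tables and the column list in named variables makes it easier to reuse exactly the same cleaning when the model later runs on new observations.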
Get The Right Data
• This is called "feature selection"
• Univariate feature selection
• SelectKBest removes all but the k highest scoring features
• SelectPercentile removes all but a user-specified highest scoring percentage of features
• SelectFpr (false positive rate), SelectFdr (false discovery rate), and SelectFwe (family-wise error) apply common univariate statistical tests to each feature
• GenericUnivariateSelect performs univariate feature selection with a configurable strategy
http://scikit-learn.org/stable/modules/feature_selection.html
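As a rough sketch of univariate feature selection, the example below runs SelectKBest on scikit-learn's bundled iris dataset. The dataset, the f_classif scoring function, and k=2 are assumptions chosen for illustration; the slides do not say which selector or settings were used.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

# Iris stands in for your own features and targets: 150 rows, 4 features.
X, y = load_iris(return_X_y=True)

# Score each feature against the target and keep the 2 highest scoring ones.
selector = SelectKBest(score_func=f_classif, k=2)
X_selected = selector.fit_transform(X, y)

# Column indices of the features that survived selection.
kept = selector.get_support(indices=True)
print(X_selected.shape, kept)
```

The fitted selector should be saved or re-applied alongside the model, since new observations need the same columns removed before prediction.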
Data => Model
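1. Build the model
http://scikit-learn.org/stable/modules/cross_validation.html
from sklearn import linear_model
# C is the inverse regularization strength; random_state makes runs repeatable
logClassifier = linear_model.LogisticRegression(C=1, random_state=111)
2. Train the model
# train_test_split lives in sklearn.model_selection (the old cross_validation module was removed)
from sklearn.model_selection import train_test_split
# Hold out 20% of the rows for testing; train_test_split takes no cv argument
X_train, X_test, y_train, y_test = train_test_split(the_data,
                                                    the_targets,
                                                    test_size=0.20,
                                                    random_state=111)
logClassifier.fit(X_train, y_train)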
Validation!
1. Accuracy Score
http://scikit-learn.org/stable/modules/cross_validation.html
from sklearn import metrics
# Predict on the held-out test set, then compare against the true labels
predicted = logClassifier.predict(X_test)
metrics.accuracy_score(y_test, predicted)
2. Confusion Matrix
metrics.confusion_matrix(y_test, predicted)
Save the Model
Pickle it!
https://docs.python.org/3/library/pickle.html
import pickle
model_file = "/lr_classifier_09.29.15.dat"
# Serialize the trained classifier; the with block closes the file for us
with open(model_file, "wb") as f:
    pickle.dump(logClassifier, f)
Did it work?
# Load it back from the same path and inspect it
with open(model_file, "rb") as f:
    logClassifier2 = pickle.load(f)
print(logClassifier2)
Implement in Production
• Clean the data the same way you did for the model
• Feature mappings
• Column re-ordering
• Create a function that returns the prediction
• Deserialize the model from the file you created
• Feed the model the data in the same order
• Call .predict() and get your answer
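One way the production steps above might fit together is sketched below. Everything named here is hypothetical: FEATURE_ORDER, PLAN_MAPPING, the clean() and predict() helpers, the toy training data, and the temporary file path are assumptions made so the example runs on its own.

```python
import os
import pickle
import tempfile

import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical names, for illustration only.
FEATURE_ORDER = ["plan", "monthly_spend"]
PLAN_MAPPING = {"basic": 0, "pro": 1}


def clean(raw: pd.DataFrame) -> pd.DataFrame:
    """Apply the same feature mappings and column order used at training time."""
    cleaned = raw.copy()
    cleaned["plan"] = cleaned["plan"].map(PLAN_MAPPING)
    return cleaned[FEATURE_ORDER]


def predict(model_path: str, raw: pd.DataFrame):
    """Deserialize the model and return one prediction per row."""
    with open(model_path, "rb") as f:
        model = pickle.load(f)
    return model.predict(clean(raw).values)


# Stand-in for the training and pickling steps, so the sketch is self-contained.
train = pd.DataFrame({
    "monthly_spend": [10.0, 12.0, 45.0, 50.0],
    "plan": ["basic", "basic", "pro", "pro"],
})
model = LogisticRegression(C=1, random_state=111)
model.fit(clean(train).values, [0, 0, 1, 1])

model_path = os.path.join(tempfile.mkdtemp(), "lr_classifier.dat")
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# New observations arrive with their columns in a different order.
new_rows = pd.DataFrame({"monthly_spend": [11.0, 48.0], "plan": ["basic", "pro"]})
predictions = predict(model_path, new_rows)
print(predictions)
```

Routing both training and production data through the same clean() function is the design choice that keeps feature mappings and column order from drifting apart. Only unpickle files you created yourself, since pickle can execute arbitrary code on load.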
Save Your Predictions
As you would any other piece of data
Ensure Accuracy Over Time
Employ your minion army, or get more creative
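A simple way to put numbers behind this is to have people verify a sample of predictions and compare the accuracy on that sample with the accuracy measured at validation time. The sketch below assumes a hypothetical helper, accuracy_has_dropped, along with made-up labels, a baseline of 0.90, and a tolerance of 0.05.

```python
from sklearn import metrics


def accuracy_has_dropped(actual, predicted, baseline, tolerance=0.05):
    """Return (flag, current accuracy); flag is True when accuracy falls below baseline - tolerance."""
    current = metrics.accuracy_score(actual, predicted)
    return current < baseline - tolerance, current


# Hypothetical sample: outcomes confirmed by reviewers vs. what the model predicted.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 1]

dropped, current = accuracy_has_dropped(actual, predicted, baseline=0.90)
print(dropped, current)
```

Running a check like this on a schedule gives an early signal that the model needs retraining.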
Update the Model
Train it again, but with validated predictions
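Retraining could look something like the sketch below: append the validated observations to the original training set and fit a fresh classifier. The arrays are invented for illustration, and refitting from scratch on the combined data is an assumption; the slides do not specify how the update is done.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: the original training set...
X_original = np.array([[0, 10.0], [0, 12.0], [1, 45.0], [1, 50.0]])
y_original = np.array([0, 0, 1, 1])

# ...plus new observations whose outcomes have since been validated.
X_validated = np.array([[0, 11.0], [1, 48.0]])
y_validated = np.array([0, 1])

# Combine old and new rows, then train a fresh model on everything.
X_all = np.vstack([X_original, X_validated])
y_all = np.concatenate([y_original, y_validated])

updated = LogisticRegression(C=1, random_state=111)
updated.fit(X_all, y_all)
```

After retraining, validate and pickle the updated model the same way as the first one, ideally under a new dated filename so older versions remain available.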
Robert Dempsey
robertwdempsey
rdempsey
rdempsey
robertwdempsey.com
Creating Your First Predictive Model In Python

  • 4. Each New Thing You Learn Leads to another new thing to learn, and another…
  • 5. So Many Things
      1. Which predictive modeling technique to use
      2. How to get the data into a format for modeling
      3. How to ensure the “right” data is being used
      4. How to feed the data into the model
      5. How to validate the model results
      6. How to save the model to use in production
      7. How to implement the model in production and apply it to new observations
      8. How to save the new predictions
      9. How to ensure, over time, that the model is correctly predicting outcomes
      10. How to later update the model with new training data
  • 7. Format The Data
      • Pandas FTW!
      • Use the map() function to convert any text to a number
      • Fill in any missing values
      • Split the data into features (the data) and targets (the outcome to predict) using .values on the DataFrame
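A minimal sketch of the formatting steps on this slide, assuming a small made-up DataFrame. The column names, the mapping dictionaries, and the choice of the median as a fill value are illustrative assumptions, not taken from the slides; only the variable names the_data and the_targets follow the deck.

```python
import pandas as pd

# Hypothetical toy data; column names and values are invented for illustration.
df = pd.DataFrame({
    "color": ["red", "blue", None, "red"],
    "size": [1.0, None, 3.0, 4.0],
    "bought": ["yes", "no", "no", "yes"],
})

# Convert text to numbers with map(); values not found in the dict become NaN.
df["color"] = df["color"].map({"red": 0, "blue": 1})
df["bought"] = df["bought"].map({"yes": 1, "no": 0})

# Fill in missing values. The median is only one possible strategy.
df["color"] = df["color"].fillna(df["color"].median())
df["size"] = df["size"].fillna(df["size"].median())

# Split into features and targets as NumPy arrays using .values.
the_data = df[["color", "size"]].values
the_targets = df["bought"].values
```

Whatever mappings and fill values are chosen here need to be recorded, because the same transformations have to be applied to new observations later.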
  • 8. Get The Right Data
      • This is called “Feature selection”
      • Univariate feature selection
          • SelectKBest removes all but the k highest scoring features
          • SelectPercentile removes all but a user-specified highest scoring percentage of features
          • SelectFpr (false positive rate), SelectFdr (false discovery rate), and SelectFwe (family-wise error) apply common univariate statistical tests to each feature
          • GenericUnivariateSelect performs univariate feature selection with a configurable strategy
      http://scikit-learn.org/stable/modules/feature_selection.html
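As a rough illustration of univariate feature selection, the sketch below runs SelectKBest on synthetic data. The dataset from make_classification, the f_classif scoring function, and k=3 are assumptions chosen for the example; the slides do not say which scorer or k to use.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic stand-in data: 200 rows, 10 features, 3 of them informative.
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=3, random_state=111
)

# Keep only the 3 features with the highest ANOVA F-scores.
selector = SelectKBest(score_func=f_classif, k=3)
X_selected = selector.fit_transform(X, y)

# Column indices of the features that were kept.
kept = selector.get_support(indices=True)
print(kept, X_selected.shape)
```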
  • 9. Data => Model
      1. Build the model
         http://scikit-learn.org/stable/modules/cross_validation.html
         from sklearn import linear_model
         logClassifier = linear_model.LogisticRegression(C=1, random_state=111)
      2. Train the model
         # train_test_split lives in sklearn.model_selection (the old sklearn.cross_validation module was removed in scikit-learn 0.20)
         from sklearn import model_selection
         # Hold out 20% of the rows for testing
         X_train, X_test, y_train, y_test = model_selection.train_test_split(the_data, the_targets, test_size=0.20, random_state=111)
         logClassifier.fit(X_train, y_train)
  • 10. Validation!
      1. Accuracy Score
         http://scikit-learn.org/stable/modules/cross_validation.html
         from sklearn import metrics
         # Predict on the held-out test set before scoring
         predicted = logClassifier.predict(X_test)
         metrics.accuracy_score(y_test, predicted)
      2. Confusion Matrix
         metrics.confusion_matrix(y_test, predicted)
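The build, train, and validate steps can be sketched end to end as follows. This is a self-contained approximation on synthetic data: make_classification stands in for the deck's the_data and the_targets, and the import path assumes a scikit-learn version where train_test_split is in sklearn.model_selection.

```python
from sklearn import linear_model, metrics
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the_data / the_targets.
the_data, the_targets = make_classification(
    n_samples=500, n_features=8, random_state=111
)

# Hold out 20% of the rows for validation.
X_train, X_test, y_train, y_test = train_test_split(
    the_data, the_targets, test_size=0.20, random_state=111
)

# Build and train the classifier with the settings shown on the slide.
logClassifier = linear_model.LogisticRegression(C=1, random_state=111)
logClassifier.fit(X_train, y_train)

# Validate on data the model has not seen.
predicted = logClassifier.predict(X_test)
accuracy = metrics.accuracy_score(y_test, predicted)
confusion = metrics.confusion_matrix(y_test, predicted)
print(accuracy)
print(confusion)
```

In the confusion matrix, rows correspond to true classes and columns to predicted classes, so the diagonal holds the correct predictions.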
  • 11. Save the Model
      Pickle it!
      https://docs.python.org/3/library/pickle.html
      import pickle
      model_file = "/lr_classifier_09.29.15.dat"
      # Serialize the trained classifier to disk
      with open(model_file, "wb") as f:
          pickle.dump(logClassifier, f)
      Did it work?
      # Load it back from the same path and inspect it
      with open(model_file, "rb") as f:
          logClassifier2 = pickle.load(f)
      print(logClassifier2)
  • 12. Implement in Production
      • Clean the data the same way you did for the model
          • Feature mappings
          • Column re-ordering
      • Create a function that returns the prediction
          • Deserialize the model from the file you created
          • Feed the model the data in the same order
          • Call .predict() and get your answer
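One way the production steps might look in code is sketched below. All names here (FEATURE_ORDER, COLOR_MAP, clean, predict_from_file) and the toy data are hypothetical; a tiny model is trained and pickled to a temporary file only so the example runs on its own.

```python
import os
import pickle
import tempfile

import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical cleaning rules, shared by training and production.
FEATURE_ORDER = ["color", "size"]
COLOR_MAP = {"red": 0, "blue": 1}


def clean(raw_df):
    """Apply the same feature mappings and column order used for training."""
    out = raw_df.copy()
    out["color"] = out["color"].map(COLOR_MAP)
    return out[FEATURE_ORDER]


def predict_from_file(model_path, raw_df):
    """Deserialize the model, feed it cleaned data, and return predictions."""
    with open(model_path, "rb") as f:
        model = pickle.load(f)
    return model.predict(clean(raw_df).values)


# Stand-in for the earlier training and pickling steps.
train = pd.DataFrame({
    "size": [1.0, 2.0, 8.0, 9.0],
    "color": ["red", "red", "blue", "blue"],
    "bought": [0, 0, 1, 1],
})
model = LogisticRegression(C=1, random_state=111)
model.fit(clean(train).values, train["bought"].values)
model_path = os.path.join(tempfile.mkdtemp(), "model.dat")
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# New observations arrive with columns in a different order; clean() fixes that.
new_obs = pd.DataFrame({"size": [1.5, 8.5], "color": ["red", "blue"]})
predictions = predict_from_file(model_path, new_obs)
print(predictions)
```

Keeping the mappings and column order in one shared place is a design choice that helps training and production stay in sync. Only unpickle files from a trusted source, since loading a pickle can execute arbitrary code.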
  • 13. Save Your Predictions As you would any other piece of data
  • 14. Ensure Accuracy Over Time Employ your minion army, or get more creative
  • 15. Update the Model Train it again, but with validated predictions
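A possible reading of "train it again" is to refit on the original training data plus new observations whose outcomes have since been validated. The sketch below assumes that interpretation and uses synthetic data; the split into "old" and "new" rows is invented for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in: the first 200 rows play the original training set,
# the last 100 play new observations with validated outcomes.
X, y = make_classification(n_samples=300, n_features=5, random_state=111)
X_old, y_old = X[:200], y[:200]
X_new, y_new = X[200:], y[200:]

# Combine old and new data, then train a fresh model on all of it.
X_all = np.vstack([X_old, X_new])
y_all = np.concatenate([y_old, y_new])
updated_classifier = LogisticRegression(C=1, random_state=111)
updated_classifier.fit(X_all, y_all)
print(updated_classifier.score(X_all, y_all))
```

The updated model would then go through the same validation and pickling steps as the first one before replacing it in production.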