6 XG Boost - Jupyter Notebook

The document shows code for loading and preparing a customer churn dataset from a CSV file and modeling it with XGBoost. It loads the data, label-encodes the categorical features (Geography and Gender), one-hot encodes Geography, splits the data into training and test sets, fits an XGBoost classifier on the training set, makes predictions on the test set, and evaluates the model with a confusion matrix and an accuracy score.

Uploaded by venkatesh m
© All Rights Reserved

In [1]: import numpy as np
        import matplotlib.pyplot as plt
        import pandas as pd

In [2]: dataset = pd.read_csv("D:\\Course\\Python\\Datasets\\Churn_Modelling.csv")


In [3]: dataset

Out[3]:       RowNumber  CustomerId  Surname    CreditScore  Geography  Gender  Age  Tenure   Balance  …

        0             1    15634602  Hargrave           619  France     Female   42       2       0.0
        1             2    15647311  Hill               608  Spain      Female   41       1   83807.8
        2             3    15619304  Onio               502  France     Female   42       8  159660.8
        3             4    15701354  Boni               699  France     Female   39       1       0.0
        4             5    15737888  Mitchell           850  Spain      Female   43       2  125510.8
        ...         ...         ...  ...                ...  ...        ...     ...     ...       ...
        9995       9996    15606229  Obijiaku           771  France     Male     39       5       0.0
        9996       9997    15569892  Johnstone          516  France     Male     35      10   57369.6
        9997       9998    15584532  Liu                709  France     Female   36       7       0.0
        9998       9999    15682355  Sabbatini          772  Germany    Male     42       3   75075.3
        9999      10000    15628319  Walker             792  France     Female   28       4  130142.7

        10000 rows × 14 columns

In [4]: X = dataset.iloc[:, 3:13].values
        X
        y = dataset.iloc[:, 13].values
        y

...

In [5]: X
        test1 = pd.DataFrame(X)
        test1

...
In [6]: from sklearn.preprocessing import LabelEncoder, OneHotEncoder

        # Convert the categorical columns (Geography, Gender) into numbers (0, 1, 2)
        labelencoder_X_1 = LabelEncoder()
        X[:, 1] = labelencoder_X_1.fit_transform(X[:, 1])

        labelencoder_X_2 = LabelEncoder()
        X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])

In [7]: X
        test2 = pd.DataFrame(X)
        test2

...

In [ ]: # Creating 3 dummy variables for country (a factor with 3 levels: Spain, France and Germany)

        #onehotencoder = OneHotEncoder()
        #X = onehotencoder.fit_transform(X).toarray()
        #X = X[:, 1:]
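The commented-out code above one-hot encodes the country and then drops the first dummy column. A minimal sketch of the same idea on toy data (assuming scikit-learn ≥ 0.21, where `OneHotEncoder` accepts string columns and a `drop` parameter that handles the dummy-variable trap in one step):

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Toy Geography column with the three levels seen in the dataset.
countries = np.array([["France"], ["Spain"], ["Germany"], ["France"]])

# drop='first' removes one dummy column: 3 levels -> 2 columns.
encoder = OneHotEncoder(drop="first")
dummies = encoder.fit_transform(countries).toarray()
print(dummies.shape)  # (4, 2)
```

Dropping one level avoids perfectly collinear dummy columns; tree-based models like XGBoost tolerate the full set, but linear models generally should not see it.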

In [8]: from sklearn.compose import ColumnTransformer

        ct = ColumnTransformer([("Geography", OneHotEncoder(), [1])], remainder='passthrough')
        X = ct.fit_transform(X)

C:\Users\rgandyala\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\preprocessing\_encoders.py:415: FutureWarning: The handling of integer data will change in version 0.22. Currently, the categories are determined based on the range [0, max(values)], while in the future they will be determined based on the unique values.

If you want the future behaviour and silence this warning, you can specify "categories='auto'".

In case you used a LabelEncoder before this OneHotEncoder to convert the categories to integers, then you can now use the OneHotEncoder directly.

  warnings.warn(msg, FutureWarning)
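As the warning itself suggests, newer scikit-learn versions let `OneHotEncoder` consume string categories directly, so the earlier `LabelEncoder` pass on Geography becomes unnecessary. A minimal sketch on made-up rows (the `[CreditScore, Geography, Gender]` column layout here is an illustrative assumption, not the full 10-column matrix from the notebook):

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Toy feature rows shaped like [CreditScore, Geography, Gender].
rows = np.array([[619, "France", "Female"],
                 [608, "Spain", "Female"],
                 [772, "Germany", "Male"]], dtype=object)

# One-hot encode the raw Geography strings; sparse_threshold=0 forces a
# dense array so the result is easy to inspect.
ct = ColumnTransformer([("Geography", OneHotEncoder(), [1])],
                       remainder="passthrough", sparse_threshold=0)
out = ct.fit_transform(rows)
print(out.shape)  # (3, 5): 3 dummy columns + 2 passthrough columns
```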

In [9]: abc = pd.DataFrame(X)
        abc

...
In [10]: from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2)
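One caveat with the split above: no `random_state` is passed, so every run produces a different 80/20 split and the confusion matrix and accuracy further down will vary slightly between runs. A small sketch on toy data showing how fixing the seed makes the split reproducible:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X_toy = np.arange(20).reshape(10, 2)
y_toy = np.arange(10)

# The same random_state yields identical splits on every call.
Xa, _, ya, _ = train_test_split(X_toy, y_toy, test_size=0.2, random_state=0)
Xb, _, yb, _ = train_test_split(X_toy, y_toy, test_size=0.2, random_state=0)
print(np.array_equal(Xa, Xb))  # True
```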

In [12]: from xgboost.sklearn import XGBClassifier

         classifier = XGBClassifier()

In [13]: classifier.fit(X_train,y_train)

...

In [14]: y_pred = classifier.predict(X_test)

In [15]: y_pred

Out[15]: array([0, 0, 0, ..., 0, 0, 0], dtype=int64)

In [16]: from sklearn.metrics import confusion_matrix

         cm = confusion_matrix(y_test, y_pred)

In [17]: cm

Out[17]: array([[1541,   69],
                [ 200,  190]], dtype=int64)
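As a sanity check, the accuracy reported in Out[19] below can be recomputed directly from this confusion matrix: the diagonal holds the correct predictions, and the test set has 2000 rows (20% of 10000).

```python
import numpy as np

# Confusion matrix from Out[17] (rows: true class, columns: predicted class).
cm = np.array([[1541,   69],
               [ 200,  190]])

# Accuracy = correct predictions / all predictions = trace / total.
accuracy = cm.trace() / cm.sum()
print(accuracy)  # 0.8655
```

Note the asymmetry the single accuracy number hides: the model misses about half of the actual churners (190 of 390), which is worth checking with per-class metrics on an imbalanced problem like churn.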

In [18]: from sklearn.metrics import accuracy_score

         Accuracy_Score = accuracy_score(y_test, y_pred)

In [19]: Accuracy_Score

Out[19]: 0.8655
