6 XG Boost - Jupyter Notebook
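The cells that load the data and define X and y are not shown in this transcript. A minimal sketch of what they plausibly contain, assuming the standard Churn_Modelling.csv churn dataset whose columns appear in Out[3] below; the file name and the column indices (3:13 for features, 13 for the Exited target) are assumptions, not part of the original notebook:

import pandas as pd

# Load the churn dataset (file name assumed; columns match Out[3] below)
dataset = pd.read_csv('Churn_Modelling.csv')

# Features: CreditScore through EstimatedSalary
# (drop RowNumber, CustomerId, Surname, which carry no predictive signal)
X = dataset.iloc[:, 3:13].values
# Target: the Exited column (1 = customer left the bank)
y = dataset.iloc[:, 13].values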
In [3]: dataset
Out[3]: RowNumber CustomerId Surname CreditScore Geography Gender Age Tenure Balance
...
In [5]: # View the raw feature matrix X as a DataFrame
test1 = pd.DataFrame(X)
test1
...
In [6]: from sklearn.preprocessing import LabelEncoder, OneHotEncoder
# Convert the categorical columns to integer codes (0, 1, 2):
# column 1 = Geography, column 2 = Gender
labelencoder_X_1 = LabelEncoder()
X[:, 1] = labelencoder_X_1.fit_transform(X[:, 1])
labelencoder_X_2 = LabelEncoder()
X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])
In [7]: # Inspect X again after label encoding
test2 = pd.DataFrame(X)
test2
...
In [ ]: # Create dummy variables for Geography (three factor levels: Spain, France, Germany)
#onehotencoder = OneHotEncoder()
#X = onehotencoder.fit_transform(X).toarray()
#X = X[:, 1:]  # drop one dummy column to avoid the dummy-variable trap
C:\Users\rgandyala\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\preprocessing\_encoders.py:415: FutureWarning: The handling of integer data will change in version 0.22. Currently, the categories are determined based on the range [0, max(values)], while in the future they will be determined based on the unique values.
If you want the future behaviour and silence this warning, you can specify "categories='auto'".
In case you used a LabelEncoder before this OneHotEncoder to convert the categories to integers, then you can now use the OneHotEncoder directly.
warnings.warn(msg, FutureWarning)
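The FutureWarning above comes from the pre-0.22 OneHotEncoder API. A minimal sketch of the modern equivalent using ColumnTransformer; this cell is not in the original notebook, and the column index and option names assume scikit-learn 0.22+:

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# One-hot encode only the Geography column (index 1); pass the rest through.
# drop='first' removes one dummy level, mirroring the X = X[:, 1:] step above.
ct = ColumnTransformer(
    [('geo', OneHotEncoder(drop='first'), [1])],
    remainder='passthrough'
)
X = ct.fit_transform(X)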
In [9]: abc = pd.DataFrame(X)  # inspect X after one-hot encoding
abc
...
In [10]: from sklearn.model_selection import train_test_split
# 80/20 split; no random_state is set, so the split (and the accuracy below) varies between runs
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2)
In [2]: from xgboost.sklearn import XGBClassifier
classifier = XGBClassifier()
In [13]: classifier.fit(X_train, y_train)
...
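The cell that computes y_pred is elided from this transcript; a minimal sketch of the usual step, assuming the standard scikit-learn predict API exposed by XGBClassifier:

# Predict class labels for the held-out test set
y_pred = classifier.predict(X_test)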
In [15]: y_pred
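Likewise, the cell that builds cm is not shown; a plausible reconstruction using scikit-learn's confusion_matrix:

from sklearn.metrics import confusion_matrix
# Rows = actual classes, columns = predicted classes
cm = confusion_matrix(y_test, y_pred)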
In [17]: cm
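The Accuracy_Score computation is also elided; a sketch assuming sklearn.metrics.accuracy_score, which would produce the scalar shown in Out[19] (0.8655 corresponds to 1,731 correct predictions out of the 2,000-row test set, i.e. 20% of 10,000 rows):

from sklearn.metrics import accuracy_score
# Fraction of correct predictions on the test set
Accuracy_Score = accuracy_score(y_test, y_pred)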
In [19]: Accuracy_Score
Out[19]: 0.8655