Day 21 - Code Jupyter Notebook

Uploaded by

Rakesh Kumar Jha

Welcome to 30 Days ML | Day 21

Import Library
In [2]: import numpy as np
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
import seaborn as sns

Import Dataset
In [3]: df = pd.read_csv('train.csv')

In [4]: df.head()

Out[4]:
   PassengerId  Survived  Pclass  Name                                               Sex     Age   SibSp  Parch  Ticket            Fare     Cabin  Emb
0  1            0         3       Braund, Mr. Owen Harris                            male    22.0  1      0      A/5 21171         7.2500   NaN
1  2            1         1       Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0  1      0      PC 17599          71.2833  C85
2  3            1         3       Heikkinen, Miss. Laina                             female  26.0  0      0      STON/O2. 3101282  7.9250   NaN
3  4            1         1       Futrelle, Mrs. Jacques Heath (Lily May Peel)       female  35.0  1      0      113803            53.1000  C123
4  5            0         3       Allen, Mr. William Henry                           male    35.0  0      0      373450            8.0500   NaN

In [5]: df = pd.read_csv('train.csv')[['Age','Pclass','SibSp','Parch','Survived']]

In [6]: df.head()

Out[6]:
Age Pclass SibSp Parch Survived

0 22.0 3 1 0 0

1 38.0 1 1 0 1

2 26.0 3 0 0 1

3 35.0 1 1 0 1

4 35.0 3 0 0 0
Drop NA Value
In [7]: df.dropna(inplace=True)

In [8]: df.sample(5)

Out[8]:
Age Pclass SibSp Parch Survived

663 36.0 3 0 0 0

498 25.0 1 1 2 0

342 28.0 2 0 0 0

136 19.0 1 0 2 1

884 25.0 3 0 0 0
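Before dropping rows it can help to check how many values are missing and how many rows the drop removes. The following is a minimal sketch on a hypothetical three-row frame (`toy` is an illustration, not the notebook's data):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing Age value (not the Titanic data).
toy = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0],
    "Pclass": [3, 1, 3],
    "Survived": [0, 1, 1],
})

# Count missing values per column before dropping anything.
missing_per_column = toy.isna().sum()

# dropna() without inplace=True returns a new frame and leaves toy untouched.
cleaned = toy.dropna()
rows_dropped = len(toy) - len(cleaned)

print(missing_per_column)
print("rows dropped:", rows_dropped)
```

Running the same two checks on `df` before In [7] would show how many passengers are lost because their Age is missing.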

Separate X and Y
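In [9]: # .copy() makes X an independent frame, so adding columns to it later
        # does not raise pandas' SettingWithCopyWarning
        X = df.iloc[:,0:4].copy()
        y = df.iloc[:,-1]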

In [10]: X.head()

Out[10]:
Age Pclass SibSp Parch

0 22.0 3 1 0

1 38.0 1 1 0

2 26.0 3 0 0

3 35.0 1 1 0

4 35.0 3 0 0
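The positional slices above rely on `Survived` being the last column. A label-based split, sketched here on a hypothetical two-row frame (`toy`, `X_toy` and `y_toy` are illustrative names), would not depend on column order:

```python
import pandas as pd

# Hypothetical rows using the notebook's column names.
toy = pd.DataFrame({
    "Age": [22.0, 38.0],
    "Pclass": [3, 1],
    "SibSp": [1, 1],
    "Parch": [0, 0],
    "Survived": [0, 1],
})

# Select the target by name and keep every other column as a feature.
X_toy = toy.drop(columns="Survived")
y_toy = toy["Survived"]

print(list(X_toy.columns))
print(y_toy.tolist())
```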

Check Accuracy for Logistic Regression

In [11]: np.mean(cross_val_score(LogisticRegression(),X,y,scoring='accuracy',cv=20))

Out[11]: 0.6933333333333332
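The mean hides how much accuracy varies from fold to fold, which matters when two feature sets are compared on a small difference. The sketch below keeps the per-fold scores and reports their spread. It runs on synthetic data from `make_classification` (an assumption made so the block is self-contained), so its numbers are unrelated to the Titanic result above:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for X and y (assumption: 400 rows, 4 numeric features).
X_demo, y_demo = make_classification(
    n_samples=400, n_features=4, n_informative=2, n_redundant=0, random_state=0
)

# Same call shape as In [11], but the 20 fold scores are kept.
fold_scores = cross_val_score(
    LogisticRegression(max_iter=1000), X_demo, y_demo, scoring="accuracy", cv=20
)

mean_acc = fold_scores.mean()
std_acc = fold_scores.std()
print(f"mean accuracy: {mean_acc:.3f}, std across folds: {std_acc:.3f}")
```

If the standard deviation is larger than the gap between two mean scores, the gap alone is weak evidence that one feature set is better.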

Applying Feature Construction

Create New Column

In [12]: X['Family_size'] = X['SibSp'] + X['Parch'] + 1
In [13]: X.head()

Out[13]:
Age Pclass SibSp Parch Family_size

0 22.0 3 1 0 2

1 38.0 1 1 0 2

2 26.0 3 0 0 1

3 35.0 1 1 0 2

4 35.0 3 0 0 1

Apply New Function

In [14]: def myfunc(num):
             if num == 1:
                 # alone
                 return 0
             elif num > 1 and num <= 4:
                 # small family
                 return 1
             else:
                 # large family
                 return 2

In [15]: myfunc(4)

Out[15]: 1
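The same three bins can also be produced without a row-by-row function. The sketch below uses `pd.cut` as an alternative (it is not what the notebook does) and checks it against `myfunc` on a few hypothetical family sizes:

```python
import pandas as pd

def myfunc(num):
    # Same rule as In [14]: 0 = alone, 1 = small family, 2 = large family.
    if num == 1:
        return 0
    elif num > 1 and num <= 4:
        return 1
    else:
        return 2

# Hypothetical family sizes, not taken from the dataset.
sizes = pd.Series([1, 2, 3, 4, 5, 8])

# Bins are right-inclusive by default: (0, 1] -> 0, (1, 4] -> 1, (4, inf] -> 2.
family_type = pd.cut(
    sizes, bins=[0, 1, 4, float("inf")], labels=[0, 1, 2]
).astype(int)

print(family_type.tolist())
print(sizes.apply(myfunc).tolist())
```

Both approaches agree for positive sizes. `pd.cut` returns a missing value for a size of 0 or less, whereas `myfunc` would label it 2, but `Family_size` is always at least 1 here.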

Apply My Function
In [16]: X['Family_type'] = X['Family_size'].apply(myfunc)

In [17]: X.head()

Out[17]:
Age Pclass SibSp Parch Family_size Family_type

0 22.0 3 1 0 2 1

1 38.0 1 1 0 2 1

2 26.0 3 0 0 1 0

3 35.0 1 1 0 2 1

4 35.0 3 0 0 1 0

Drop unwanted columns

In [18]: X.drop(columns=['SibSp','Parch','Family_size'],inplace=True)
In [19]: X.head()

Out[19]:
Age Pclass Family_type

0 22.0 3 1

1 38.0 1 1

2 26.0 3 0

3 35.0 1 1

4 35.0 3 0

Review Accuracy after Feature Construction

In [20]: np.mean(cross_val_score(LogisticRegression(),X,y,scoring='accuracy',cv=20))

Out[20]: 0.7003174603174602

Feature Splitting (New Topic)

Review Import Data

In [21]: df = pd.read_csv('train.csv')

In [22]: df.head()

Out[22]:
   PassengerId  Survived  Pclass  Name                                               Sex     Age   SibSp  Parch  Ticket            Fare     Cabin  Emb
0  1            0         3       Braund, Mr. Owen Harris                            male    22.0  1      0      A/5 21171         7.2500   NaN
1  2            1         1       Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0  1      0      PC 17599          71.2833  C85
2  3            1         3       Heikkinen, Miss. Laina                             female  26.0  0      0      STON/O2. 3101282  7.9250   NaN
3  4            1         1       Futrelle, Mrs. Jacques Heath (Lily May Peel)       female  35.0  1      0      113803            53.1000  C123
4  5            0         3       Allen, Mr. William Henry                           male    35.0  0      0      373450            8.0500   NaN

In [23]: #Use Name Column

In [24]: df['Name']

Out[24]: 0 Braund, Mr. Owen Harris
1 Cumings, Mrs. John Bradley (Florence Briggs Th...
2 Heikkinen, Miss. Laina
3 Futrelle, Mrs. Jacques Heath (Lily May Peel)
4 Allen, Mr. William Henry
...
886 Montvila, Rev. Juozas
887 Graham, Miss. Margaret Edith
888 Johnston, Miss. Catherine Helen "Carrie"
889 Behr, Mr. Karl Howell
890 Dooley, Mr. Patrick
Name: Name, Length: 891, dtype: object

Separate Salutation
In [25]: df['Title'] = df['Name'].str.split(', ', expand=True)[1].str.split('.', expand=True)[0]

In [26]: df['Name'].str.split(', ', expand=True)[1].str.split('.', expand=True)[0]

Out[26]: 0 Mr
1 Mrs
2 Miss
3 Mrs
4 Mr
...
886 Rev
887 Miss
888 Miss
889 Mr
890 Mr
Name: 0, Length: 891, dtype: object

In [27]: df[['Title','Name']]

Out[27]:
Title Name

0 Mr Braund, Mr. Owen Harris

1 Mrs Cumings, Mrs. John Bradley (Florence Briggs Th...

2 Miss Heikkinen, Miss. Laina

3 Mrs Futrelle, Mrs. Jacques Heath (Lily May Peel)

4 Mr Allen, Mr. William Henry

... ... ...

886 Rev Montvila, Rev. Juozas

887 Miss Graham, Miss. Margaret Edith

888 Miss Johnston, Miss. Catherine Helen "Carrie"

889 Mr Behr, Mr. Karl Howell

890 Mr Dooley, Mr. Patrick

891 rows × 2 columns
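The double split works because every name follows the pattern "Surname, Title. Given names". A single regular expression can express the same rule. The sketch below uses `Series.str.extract` on the first five names from the output above (the pattern itself is an illustration, not the notebook's method):

```python
import pandas as pd

# First five names as shown in Out[24].
names = pd.Series([
    "Braund, Mr. Owen Harris",
    "Cumings, Mrs. John Bradley (Florence Briggs Th...",
    "Heikkinen, Miss. Laina",
    "Futrelle, Mrs. Jacques Heath (Lily May Peel)",
    "Allen, Mr. William Henry",
])

# Capture the text between the first ", " and the next "." as the title.
titles = names.str.extract(r",\s*([^.]+)\.", expand=False)

print(titles.tolist())
```

With `expand=False` and one capture group the result is a Series, so it can be assigned straight to a column such as `df['Title']`.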

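Review Accuracy after Feature Splitting
In [28]: # X still holds only Age, Pclass and Family_type; the new Title column
         # was added to df, not X, so this score matches Out[20].
         np.mean(cross_val_score(LogisticRegression(),X,y,scoring='accuracy',cv=20))

Out[28]: 0.7003174603174602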
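For the Title column to affect the score it would have to be added to X, and since LogisticRegression needs numeric inputs it would have to be encoded first. One possible encoding is one-hot via `pd.get_dummies`, sketched here on five hypothetical rows (`toy` and `encoded` are illustrative names, and this step is not part of the notebook):

```python
import pandas as pd

# Hypothetical rows combining numeric features with the extracted Title.
toy = pd.DataFrame({
    "Age": [22.0, 38.0, 26.0, 35.0, 35.0],
    "Pclass": [3, 1, 3, 1, 3],
    "Title": ["Mr", "Mrs", "Miss", "Mrs", "Mr"],
})

# Replace Title with one indicator column per distinct title.
encoded = pd.get_dummies(toy, columns=["Title"])

print(list(encoded.columns))
print(encoded["Title_Mr"].astype(int).tolist())
```

On the full dataset rare titles such as Rev would each get their own column, so grouping infrequent titles into one category before encoding may be worth considering.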
