Dev Assignment - 1

9/23/24, 8:23 PM DEV ASSIGNMENT--1

In [19]:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load the dataset
df = pd.read_csv("titanic.csv")

# Display the first few rows
print("head:")
print(df.head())

# Display the last few rows
print("\ntail:")
print(df.tail())

# Display information about the dataset
# (df.info() prints its report directly and returns None, so print() also emits "None")
print("\ninfo:")
print(df.info())

# Display statistical summary
print("\ndescribe:")
print(df.describe())

# Display the shape of the dataset
print("\nshape")
print(df.shape)

# Display the column names
print("\ncolumns:")
print(df.columns)

# Visualize age distribution
plt.figure(figsize=(10, 6))
sns.histplot(df['age'], bins=30, kde=True)
plt.title('Age distribution of passengers')
plt.xlabel('age')
plt.ylabel('frequency')
plt.show()

# Visualize survival count by gender
plt.figure(figsize=(10, 6))
sns.countplot(data=df, x='sex', hue='survived')
plt.title('Survival Count by Gender')
plt.xlabel('gender')
plt.ylabel('count')
plt.legend(title='survived', loc='upper right', labels=['Did not Survive', 'survived'])
plt.show()

# Add a new column for family size
# (siblings/spouses aboard + parents/children aboard + the passenger)
df['FamilySize'] = df['sibsp'] + df['parch'] + 1

# Visualize survival rate by family size
plt.figure(figsize=(10, 6))
sns.barplot(data=df, x='FamilySize', y='survived')
plt.title('Survival Rate by Family Size')
plt.xlabel('FamilySize')
plt.ylabel('Survival Rate')
plt.show()

# Create a pivot table for survival rates by passenger class and gender
# (the mean of the 0/1 'survived' column is the survival rate)
pivot_table = df.pivot_table(values='survived', index='pclass', columns='sex', aggfunc='mean')
print("\npivot table for survival rates by passenger class and gender:")
print(pivot_table)

# Visualize the pivot table
pivot_table.plot(kind='bar', figsize=(10, 6))
plt.title('Survival Rate by Passenger Class and Gender')
plt.ylabel('Survival Rate')
plt.xlabel('pclass')
plt.xticks(rotation=0)
plt.legend(title='gender', labels=['female', 'male'])
plt.show()
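
The FamilySize step above adds sibsp and parch plus one for the passenger, and the bar plot then shows the mean of survived within each family size. The following is a minimal sketch of that computation on a small made-up frame. The values in toy are hypothetical and are not taken from titanic.csv.

```python
import pandas as pd

# Toy data (hypothetical values, not taken from titanic.csv)
toy = pd.DataFrame({
    "survived": [0, 1, 1, 0, 1, 0],
    "sibsp":    [1, 1, 0, 0, 2, 0],
    "parch":    [0, 0, 0, 0, 1, 2],
})

# Same feature as in the notebook: relatives aboard plus the passenger
toy["FamilySize"] = toy["sibsp"] + toy["parch"] + 1

# The mean of a 0/1 column per group is the survival rate for that group;
# this is the quantity a mean-based bar plot draws as the bar height
rate_by_size = toy.groupby("FamilySize")["survived"].mean()
print(rate_by_size)
```

Groups with very few passengers give noisy rates, so large family sizes in the real plot should be read with care.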

head:
   survived  pclass                                               name  \
0         0       3                            Braund, Mr. Owen Harris
1         1       1  Cumings, Mrs. John Bradley (Florence Briggs Th...
2         1       3                             Heikkinen, Miss. Laina
3         1       1       Futrelle, Mrs. Jacques Heath (Lily May Peel)
4         0       3                           Allen, Mr. William Henry

      sex   age     fare  sibsp  parch
0    male  22.0   7.2500      1      0
1  female  38.0  71.2833      1      0
2  female  26.0   7.9250      0      0
3  female  35.0  53.1000      1      0
4    male  35.0   8.0500      0      0

tail:
     survived  pclass                                  name     sex   age  \
709         0       3  Rice, Mrs. William (Margaret Norton)  female  39.0
710         0       2                 Montvila, Rev. Juozas    male  27.0
711         1       1          Graham, Miss. Margaret Edith  female  19.0
712         1       1                 Behr, Mr. Karl Howell    male  26.0
713         0       3                   Dooley, Mr. Patrick    male  32.0

       fare  sibsp  parch
709  29.125      0      5
710  13.000      0      0
711  30.000      0      0
712  30.000      0      0
713   7.750      0      0

info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 714 entries, 0 to 713
Data columns (total 8 columns):
 #   Column    Non-Null Count  Dtype
---  ------    --------------  -----
 0   survived  714 non-null    int64
 1   pclass    714 non-null    int64
 2   name      714 non-null    object
 3   sex       714 non-null    object
 4   age       714 non-null    float64
 5   fare      714 non-null    float64
 6   sibsp     714 non-null    int64
 7   parch     714 non-null    int64
dtypes: float64(2), int64(4), object(2)
memory usage: 44.8+ KB
None
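
The info output reports 714 non-null entries in every column, so this copy of the data contains no missing values. The following sketch shows one way the same check could be made explicit before plotting. The toy frame and its single NaN are illustrative assumptions, not part of the notebook's data.

```python
import numpy as np
import pandas as pd

# Toy frame with one missing age (hypothetical values)
toy = pd.DataFrame({
    "age":  [22.0, np.nan, 26.0, 35.0],
    "fare": [7.25, 71.2833, 7.925, 53.1],
})

# Count missing values per column; all zeros would match the info output above
missing_per_column = toy.isna().sum()
print(missing_per_column)

# One possible treatment: drop rows without an age and renumber the index
cleaned = toy.dropna(subset=["age"]).reset_index(drop=True)
print(cleaned.shape)
```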

describe:
         survived      pclass         age        fare       sibsp       parch
count  714.000000  714.000000  714.000000  714.000000  714.000000  714.000000
mean     0.406162    2.236695   29.699118   34.694514    0.512605    0.431373
std      0.491460    0.838250   14.526497   52.918930    0.929783    0.853289
min      0.000000    1.000000    0.420000    0.000000    0.000000    0.000000
25%      0.000000    1.000000   20.125000    8.050000    0.000000    0.000000
50%      0.000000    2.000000   28.000000   15.741700    0.000000    0.000000
75%      1.000000    3.000000   38.000000   33.375000    1.000000    1.000000
max      1.000000    3.000000   80.000000  512.329200    5.000000    6.000000
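
Because survived is coded 0/1, its mean in the describe output is the overall survival rate. As a rough worked check using only the count and mean printed above (the variable names are illustrative):

```python
# Figures copied from the describe output
n_rows = 714
mean_survived = 0.406162

# mean = survivors / n_rows, so survivors is the nearest whole number to mean * n_rows
survivors = round(mean_survived * n_rows)
print(survivors)

# Recomputing the rate from the whole-number count reproduces the printed mean
print(round(survivors / n_rows, 6))
```

Under these figures, roughly 40.6% of the 714 passengers in this file survived.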

shape
(714, 8)

columns:
Index(['survived', 'pclass', 'name', 'sex', 'age', 'fare', 'sibsp', 'parch'], dtype='object')

pivot table for survival rates by passenger class and gender:
sex       female      male
pclass
1       0.964706  0.396040
2       0.918919  0.151515
3       0.460784  0.150198
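
Each cell of the pivot table is the mean of survived for one (pclass, sex) group. The same table can be built with groupby followed by unstack, which may make the grouping easier to see. The sketch below uses a made-up frame; its values are hypothetical and do not come from titanic.csv.

```python
import pandas as pd

# Toy data (hypothetical values)
toy = pd.DataFrame({
    "survived": [1, 1, 0, 0, 1, 0, 1, 0],
    "pclass":   [1, 1, 1, 1, 3, 3, 3, 3],
    "sex":      ["female", "female", "male", "male",
                 "female", "female", "male", "male"],
})

# Pivot table: rows are classes, columns are sexes, cells are mean survival
by_pivot = toy.pivot_table(values="survived", index="pclass",
                           columns="sex", aggfunc="mean")

# Equivalent route: group on both keys, average, then move 'sex' into columns
by_groupby = toy.groupby(["pclass", "sex"])["survived"].mean().unstack("sex")

print(by_pivot)
print(by_groupby)
```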
