DSBDA Lab Assignment No 9

The document discusses plotting a box plot to visualize the distribution of ages with respect to gender and survival status using the Titanic dataset. It provides background on box plots and the necessary steps to create the visualization in Python using Seaborn.

Uploaded by

sanudantal42003
Copyright
© All Rights Reserved

Group A

Assignment No: 9
Title of the Assignment: Data Visualization II
1. Use the inbuilt dataset 'titanic' as used in the above problem. Plot a box plot for
distribution of age with respect to each gender along with the information about whether they
survived or not. (Column names : 'sex' and 'age')
2. Write observations on the inference from the above statistics.

Objective of the Assignment: Students should be able to perform the data
visualization operation using Python on any open source dataset.

Prerequisite:
1. Basics of Python Programming
2. Seaborn Library, Concept of Data Visualization.

Theory:
BoxPlot:
A boxplot is a standardized way of displaying the distribution of data based on a five-number
summary (“minimum”, first quartile (Q1), median, third quartile (Q3), and “maximum”). It can
show which points are outliers and what their values are. It can also show whether the data is
symmetrical, how tightly it is grouped, and whether and how it is skewed.

Median (Q2/50th percentile): the middle value of the dataset.

First quartile (Q1/25th percentile): the middle value between the smallest number (not the
“minimum”) and the median of the dataset.

Third quartile (Q3/75th percentile): the middle value between the median and the highest value (not
the “maximum”) of the dataset.

Interquartile range (IQR): the distance from the 25th to the 75th percentile, i.e. Q3 - Q1.
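The definitions above can be sketched in a few lines of NumPy. This is an illustrative sketch, not part of the original assignment: the helper name box_stats and the sample ages are made up, and the 1.5 × IQR fences are the common Tukey convention (also Seaborn's default), assumed here because the text does not define the quoted “minimum” and “maximum”.

```python
import numpy as np

def box_stats(values):
    """Return the quantities a box plot draws (hypothetical helper)."""
    data = np.sort(np.asarray(values, dtype=float))
    data = data[~np.isnan(data)]          # ignore missing values
    q1, median, q3 = np.percentile(data, [25, 50, 75])
    iqr = q3 - q1                         # interquartile range
    # Assumed convention: points beyond 1.5 * IQR from the box are outliers.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = data[(data >= lower_fence) & (data <= upper_fence)]
    outliers = data[(data < lower_fence) | (data > upper_fence)]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": inside.min(),      # lowest point that is not an outlier
        "whisker_high": inside.max(),     # highest point that is not an outlier
        "outliers": outliers.tolist(),
    }

# Made-up sample of ages, for illustration only.
stats = box_stats([2, 18, 22, 25, 28, 30, 35, 40, 80])
print(stats)
```

For this sample the box spans 22 to 35 with the median at 28, and the ages 2 and 80 fall outside the fences, so they would be drawn as individual outlier points.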


Algorithm:
1. Import the required libraries
import seaborn as sns
2. Create the dataframe from the inbuilt dataset titanic
dataset = sns.load_dataset('titanic')
dataset
3. Plot a box plot for distribution of age with respect to each gender along with the
information about whether they survived or not.
# Age by gender only
sns.boxplot(x='sex', y='age', data=dataset)
# Age by gender, split by survival status
sns.boxplot(x='sex', y='age', data=dataset, hue='survived')
4. Write the observation from the displayed box plot.
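To support the written observations in step 4, it may help to read the box positions off as numbers rather than by eye. The following is a minimal sketch using pandas; the helper name age_summary is hypothetical, and it assumes a dataframe with the columns 'sex', 'survived' and 'age', as in Seaborn's titanic dataset.

```python
import pandas as pd

def age_summary(df):
    """Quartiles of age for each (sex, survived) group (hypothetical helper)."""
    # Rows with a missing age are dropped, as they would not appear in the plot.
    grouped = df.dropna(subset=["age"]).groupby(["sex", "survived"])["age"]
    # One row per group, one column per requested quantile.
    summary = grouped.quantile([0.25, 0.5, 0.75]).unstack()
    summary.columns = ["q1", "median", "q3"]
    summary["iqr"] = summary["q3"] - summary["q1"]
    summary["count"] = grouped.count()
    return summary
```

With the dataframe from step 2, this would be called as age_summary(dataset). Each row of the result corresponds to one box in the hue='survived' plot, so medians and spreads can be compared directly across gender and survival status.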

Viva Questions
1. Explain the following plots with sample diagrams:
• Histogram
• Violin Plot
2. Write code to plot a box plot for distribution of age with respect to each gender along
with the information about whether they survived or not. Write the observations.
