Box Plot Data-Aggregation To Normalization DJB Notes 25-04-2024
● Whiskers: The whiskers extend from the edges of the box to the
smallest and largest data values that lie within 1.5 times the IQR
below the first quartile and above the third quartile, respectively.
They show the range of the data, excluding outliers.
● Outliers: Data points that fall outside the whiskers are considered
outliers and are plotted individually as points. They represent data
values that are significantly different from the rest of the dataset.
● Boxplots are particularly useful for comparing distributions between
different groups or variables and identifying potential outliers. They
provide a visual summary of the data's spread, skewness, and central
tendency in a single plot, making them a valuable tool in exploratory
data analysis and statistical analysis.
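The whisker and outlier rules above can be checked numerically. The following is a minimal sketch, assuming NumPy's default (linear) percentile method and a small made-up sample; the variable names are illustrative and not taken from the notes.

```python
import numpy as np

# Hypothetical sample chosen for illustration; 40 sits far from the rest
values = np.array([2, 4, 5, 6, 7, 8, 9, 10, 40])

# First and third quartiles, and the interquartile range (IQR)
q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1

# Limits at 1.5 * IQR beyond the box edges
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme data points still inside the limits
inside = values[(values >= lower_fence) & (values <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

# Anything beyond the limits is drawn as an individual outlier point
outliers = values[(values < lower_fence) | (values > upper_fence)]

print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
print("Whiskers:", whisker_low, "to", whisker_high)
print("Outliers:", outliers)
```

Note that the whiskers stop at actual data values (2 and 10 here), not at the computed limits themselves.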
Box Plots
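import pandas as pd
import matplotlib.pyplot as plt
# Sample data, assumed here for illustration
df = pd.DataFrame({'Score': [85, 92, 78, 90, 88]})
df.boxplot(column='Score')  # draw the box plot for the 'Score' column
plt.tight_layout()
plt.show()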
Data aggregation and grouping with Pandas
● In Pandas, data aggregation and grouping are fundamental
operations for data analysis. Here's a basic overview of how to
perform these tasks:
● Grouping Data: The groupby() function is used to split the data
into groups based on some criteria.
import pandas as pd
# Create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob'],
        'Score': [85, 92, 78, 90, 88]}
df = pd.DataFrame(data)
# Grouping by 'Name'
grouped = df.groupby('Name')
# Displaying groups: each iteration yields the group key and its rows
for name, group in grouped:
    print(name)
    print(group)
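Besides looping over the groups, a GroupBy object can be inspected directly. The sketch below reuses the same Name/Score data and shows a few standard pandas accessors (`ngroups`, `groups`, `get_group()`, `size()`); it is one possible way to explore the result, not the only one.

```python
import pandas as pd

# Same example data as in the grouping example
data = {'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob'],
        'Score': [85, 92, 78, 90, 88]}
df = pd.DataFrame(data)
grouped = df.groupby('Name')

# Number of distinct groups and their keys
n_groups = grouped.ngroups
group_names = sorted(grouped.groups.keys())

# Rows belonging to a single group
alice_rows = grouped.get_group('Alice')

# Number of rows in each group
group_sizes = grouped.size()

print(n_groups, group_names)
print(alice_rows)
print(group_sizes)
```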
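● Aggregating Data: Aggregation functions such as sum(), mean(),
min(), and max() reduce a set of values to a single summary value.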
# Example data
data = [1, 2, 3, 4, 5]
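The notes stop at the example data, so the following is only a sketch of how aggregation might look in pandas. It assumes the list is wrapped in a `pd.Series` and, for the grouped case, reuses the Name/Score data from the grouping example; the names `s`, `summary`, and `per_name` are illustrative.

```python
import pandas as pd

# Example data from the notes, wrapped in a Series (an assumption)
data = [1, 2, 3, 4, 5]
s = pd.Series(data)

# Single aggregations reduce the whole Series to one value
total = s.sum()
average = s.mean()

# agg() applies several aggregation functions at once
summary = s.agg(['min', 'max', 'mean'])

# Grouped aggregation: one summary row per group
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob'],
                   'Score': [85, 92, 78, 90, 88]})
per_name = df.groupby('Name')['Score'].agg(['mean', 'max', 'count'])

print(total, average)
print(summary)
print(per_name)
```

Passing a list of function names to `agg()` returns one column per function, which keeps several summaries of the same group side by side.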