AI in HC 4
import pandas as pd

# Load the births data and preview the first rows.
births = pd.read_csv("births.csv")
print(births.head())

# Replace missing day values with 0, then store the column as integers.
births['day'] = births['day'].fillna(0).astype(int)
Output:-
This final line is a robust estimate of the sample standard
deviation, where the 0.74 comes from the interquartile range of a
Gaussian distribution (the IQR of a Gaussian is about 1.35 standard
deviations, and 1/1.35 is about 0.74). With this we can use the
query() method to filter out rows with births outside these values:
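A minimal sketch of this sigma-clipping step. The synthetic data, the use of the median as the centre, and the cut-off of 5 robust standard deviations are illustrative assumptions, not values taken from the text above:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the births table (hypothetical values, not CDC data).
births = pd.DataFrame({
    "births": [4800, 4900, 5000, 5100, 5200, 4950, 5050, 1, 90000],
})

# 25th, 50th (median) and 75th percentiles of the counts.
quartiles = np.percentile(births["births"], [25, 50, 75])
mu = quartiles[1]
# For a Gaussian, IQR is about 1.35 sigma, so sigma is about 0.74 * IQR.
sig = 0.74 * (quartiles[2] - quartiles[0])

# Keep only rows within 5 robust standard deviations of the median
# (the factor 5 is an assumed threshold).
clipped = births.query("(births > @mu - 5 * @sig) & (births < @mu + 5 * @sig)")
print(clipped)
```

Using the median and the interquartile range rather than the mean and the ordinary standard deviation keeps the threshold from being pulled towards the very outliers it is meant to remove.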
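import matplotlib.pyplot as plt

births.pivot_table('births', index='dayofweek',
                   columns='decade', aggfunc='mean').plot()
# Pin one tick per weekday (0 = Monday) before relabelling them.
plt.gca().set_xticks(range(7))
plt.gca().set_xticklabels(['Mon', 'Tues', 'Wed', 'Thurs', 'Fri',
                           'Sat', 'Sun'])
plt.ylabel('mean births by day')
plt.show()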
Output:-
Apparently, births are slightly less common on weekends than on
weekdays! Note that the 1990s and 2000s are missing because
the CDC data contains only the month of birth starting in 1989.
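The weekday pivot relies on a datetime index and on dayofweek and decade columns. A minimal sketch of one way they might be derived, assuming the table has integer year, month and day columns and that every row holds a valid calendar date (a placeholder day of 0 is not a valid date, so such rows would need to be excluded first). The sample values are hypothetical:

```python
import pandas as pd

# Small hypothetical sample with the assumed year/month/day/births columns.
births = pd.DataFrame({
    "year": [1969, 1969, 1975, 1988],
    "month": [1, 1, 6, 12],
    "day": [1, 2, 15, 31],
    "births": [4046, 4440, 4800, 4500],
})

# Combine the integer columns into a DatetimeIndex.
births.index = pd.to_datetime(births[["year", "month", "day"]])

# pandas numbers weekdays from Monday = 0 to Sunday = 6.
births["dayofweek"] = births.index.dayofweek
# Round each year down to the start of its decade.
births["decade"] = 10 * (births["year"] // 10)
print(births[["births", "dayofweek", "decade"]])
```

With the index in place, attributes such as births.index.month and births.index.day become available for grouping.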
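# Average births for each (month, day) pair across all years.
births_month = births.pivot_table('births', [births.index.month,
                                             births.index.day])
print(births_month.head())
# Map each (month, day) pair to a date in a dummy leap year (2012) so
# that February 29 is valid. pd.Timestamp replaces pd.datetime, which
# newer pandas versions no longer provide.
births_month.index = [pd.Timestamp(2012, month, day)
                      for (month, day) in births_month.index]
print(births_month.head())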
Output:-
Focusing on the month and day only, we now have a time series
reflecting the average number of births by date of the year. From
this, we can use the plot method to plot the data. It reveals some
interesting trends: