AI in HC 4

The document outlines an experiment aimed at analyzing birth rates in the United States using Python and data from the CDC. It details the process of data cleaning, visualization of birth trends by decade and day of the week, and highlights that male births consistently outnumber female births. Additionally, it explores average births by date of the year, revealing interesting trends in birth rates throughout the year.

Uploaded by

muni.kundalaiml2021

Experiment - 4

Aim:- Birth Rate Analysis using Python.

Theory:- Let’s take a look at the freely available data on births in the United States, provided by the Centers for Disease Control (CDC). This data can be found in the file births.csv.

import pandas as pd

births = pd.read_csv("births.csv")
print(births.head())

# Rows with no recorded day get 0 so the column can be stored as integers
births['day'] = births['day'].fillna(0).astype(int)

Output:-
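The effect of the fillna/astype step can be tried on a small frame. This is only a sketch: the rows below are made up for illustration and are not taken from births.csv, and the names sample and missing_day_rows are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic rows (not CDC data): the last record has no day recorded
sample = pd.DataFrame({
    "year": [1969, 1969, 1989],
    "month": [1, 1, 1],
    "day": [1.0, 2.0, np.nan],
    "gender": ["F", "M", "F"],
    "births": [4046, 4440, 156749],
})

# A column containing NaN is stored as float; filling the gaps
# with 0 makes the conversion to an integer dtype possible
sample["day"] = sample["day"].fillna(0).astype(int)

# Day 0 now marks the rows that had no day recorded
missing_day_rows = int((sample["day"] == 0).sum())
print(sample)
print("rows without a recorded day:", missing_day_rows)
```

Using 0 as the placeholder keeps the affected rows identifiable, since no real calendar day is numbered 0.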

births['decade'] = 10 * (births['year'] // 10)
print(births.head())

# Total births per decade, split by gender
print(births.pivot_table('births', index='decade', columns='gender', aggfunc='sum'))

We immediately see that male births outnumber female births in every decade. To see this trend a bit more clearly, we can use the built-in plotting tools in Pandas to visualize the total number of births by decade:

import matplotlib.pyplot as plt
import seaborn as sns

sns.set()  # use seaborn's default plot styling
birth_decade = births.pivot_table('births', index='decade', columns='gender', aggfunc='sum')
birth_decade.plot()
plt.ylabel("Total births per decade")
plt.show()

Output:-
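To make the decade grouping concrete, the sketch below applies the same integer-division and pivot_table steps to a handful of rows. The numbers are invented for illustration (they are not CDC figures), and the names toy and by_decade are hypothetical.

```python
import pandas as pd

# Made-up totals, used only to illustrate the grouping
toy = pd.DataFrame({
    "year":   [1969, 1969, 1970, 1970, 1979, 1979],
    "gender": ["F", "M", "F", "M", "F", "M"],
    "births": [100, 110, 200, 210, 300, 310],
})

# Integer division drops the last digit of the year;
# multiplying by 10 turns 196 back into 1960, 197 into 1970
toy["decade"] = 10 * (toy["year"] // 10)

# One row per decade, one column per gender, values summed
by_decade = toy.pivot_table("births", index="decade",
                            columns="gender", aggfunc="sum")
print(by_decade)
```

Both 1970 and 1979 land in the 1970 row, so their counts are added together, which is why the plotted values are totals per decade rather than per year.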

Further data exploration:

There are a few interesting features we can pull out of this dataset using the Pandas tools. We must start by cleaning the data a bit, removing outliers caused by mistyped dates or missing values. One easy way to remove these all at once is to cut outliers; we’ll do this via a robust sigma-clipping operation:

import numpy as np
quartiles = np.percentile(births['births'], [25, 50, 75])
mu = quartiles[1]
sig = 0.74 * (quartiles[2] - quartiles[0])

This final line is a robust estimate of the sample standard deviation, where the 0.74 comes from the interquartile range of a Gaussian distribution. With this we can use the query() method to filter out rows with births outside these values:

births = births.query('(births > @mu - 5 * @sig) & (births < @mu + 5 * @sig)')
births['day'] = births['day'].astype(int)
births.index = pd.to_datetime(10000 * births.year + 100 * births.month + births.day, format='%Y%m%d')
births['dayofweek'] = births.index.dayofweek
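The clipping step can be checked in isolation. The following sketch first derives the 0.74 factor from the quartiles of a standard Gaussian, then applies the same 5-sigma cut to synthetic counts. The data are randomly generated rather than taken from the CDC file, and the names factor, counts, and kept are hypothetical.

```python
import numpy as np
from statistics import NormalDist

# For a Gaussian the quartiles sit about 0.6745 sigma either side
# of the median, so the interquartile range is about 1.349 sigma
iqr_in_sigmas = 2 * NormalDist().inv_cdf(0.75)
factor = 1 / iqr_in_sigmas  # roughly 0.74

# Synthetic daily counts (not CDC data) plus two obviously bad records
rng = np.random.default_rng(0)
counts = rng.normal(loc=4800, scale=400, size=1000)
counts = np.append(counts, [1.0, 150000.0])

# Median and scaled IQR are barely affected by the two bad records
q25, q50, q75 = np.percentile(counts, [25, 50, 75])
mu = q50
sig = factor * (q75 - q25)

# Keep only values within 5 robust standard deviations of the median
kept = counts[(counts > mu - 5 * sig) & (counts < mu + 5 * sig)]
print(round(factor, 4), len(counts), len(kept))
```

Because the median and the interquartile range ignore the extreme tails, the two bad records do not inflate the cut-off the way they would inflate an ordinary mean and standard deviation. Back in the main analysis, the filtered births frame now has a datetime index and a dayofweek column.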

Using this we can plot births by weekday for several decades:

births.pivot_table('births', index='dayofweek', columns='decade', aggfunc='mean').plot()
# Place one tick per weekday before labelling them (0 = Monday)
plt.gca().set_xticks(range(7))
plt.gca().set_xticklabels(['Mon', 'Tues', 'Wed', 'Thurs', 'Fri', 'Sat', 'Sun'])
plt.ylabel('mean births by day')
plt.show()

Output:-

Apparently, births are slightly less common on weekends than on weekdays! Note that the 1990s and 2000s are missing because the CDC data contains only the month of birth starting in 1989.
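The weekday index used above comes from packing year, month, and day into a single integer and parsing it as a date. A minimal sketch of that step, using three made-up dates and the hypothetical names toy, packed, and dates:

```python
import pandas as pd

# Three consecutive dates in July 1969: a Friday, a Saturday, a Sunday
toy = pd.DataFrame({
    "year":  [1969, 1969, 1969],
    "month": [7, 7, 7],
    "day":   [18, 19, 20],
})

# 10000*year + 100*month + day packs each date into an
# integer of the form YYYYMMDD, for example 19690718
packed = 10000 * toy["year"] + 100 * toy["month"] + toy["day"]

# Converted to strings here so the format is matched against text
dates = pd.to_datetime(packed.astype(str), format="%Y%m%d")

# pandas numbers weekdays from Monday = 0 to Sunday = 6
toy["dayofweek"] = dates.dt.dayofweek
print(toy)
```

Since Monday is 0, the values 5 and 6 correspond to the weekend, which is where the dip in the weekday plot appears.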

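from datetime import datetime

# Mean births for each (month, day) pair across all years
births_month = births.pivot_table('births', [births.index.month, births.index.day])
print(births_month.head())

# Attach a dummy leap year (2012) so that February 29 is a valid date
births_month.index = [datetime(2012, month, day) for (month, day) in births_month.index]
print(births_month.head())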

Output:-

Focusing on the month and day only, we now have a time series
reflecting the average number of births by date of the year. From
this, we can use the plot method to plot the data. It reveals some
interesting trends:

fig, ax = plt.subplots(figsize=(12, 4))
births_month.plot(ax=ax)
plt.show()

Output:-
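The averaging by date of the year can be followed on a few rows. The sketch below uses invented counts for late February and early March in two different years; the names toy and by_date are hypothetical, and the dummy year 2012 is chosen only because it is a leap year.

```python
from datetime import datetime
import pandas as pd

# Made-up counts for the same calendar days in 1970 and 1972
idx = pd.to_datetime(["1970-02-28", "1970-03-01",
                      "1972-02-28", "1972-02-29", "1972-03-01"])
toy = pd.DataFrame({"births": [100, 120, 110, 90, 130]}, index=idx)

# Group by (month, day) so every year contributes to the same
# calendar date; pivot_table averages by default
by_date = toy.pivot_table("births", [toy.index.month, toy.index.day])

# Replace the (month, day) pairs with dates in a leap year,
# which keeps 29 February valid on the time axis
by_date.index = [datetime(2012, month, day) for (month, day) in by_date.index]
print(by_date)
```

February 29 only occurs in 1972 here, so its average is based on a single value, while the other two dates average over both years.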
