Olympics

A dataset for practicing Python data formatting.

Uploaded by

Ch Umar

Class Assignment: Olympics Dataset Analysis

Dataset: You will be working with the provided olympics.csv dataset, which
contains information about Olympic athletes.
Tasks:

Part 1: Data Cleaning and Transformation in Excel


1. Remove Duplicates: Remove duplicate rows, treating two rows as
duplicates only when every column matches.
2. Handle Missing Values: Identify and handle missing values in the Age
column. Use the average age for imputation.
3. Create New Column: Create a new column named Age Category with
the following categories:
• Teen if Age < 20
• Young Adult if 20 <= Age < 30
• Adult if 30 <= Age < 40
• Senior if Age >= 40
4. Filter Data: Filter the data to show only rows where the Medal column
is not equal to 0.
5. Sort Data: Sort the data by Year in ascending order and then by Name
in alphabetical order.
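
The Age Category boundaries in step 3 are easy to get wrong by one year at
20, 30, and 40. As a minimal sketch for cross-checking the Excel column, the
rule can be written as a small Python function; the name age_category is
hypothetical and is not part of the assignment.

```python
def age_category(age: float) -> str:
    """Map an age to the category defined in step 3 of Part 1."""
    if age < 20:
        return "Teen"
    if age < 30:
        return "Young Adult"
    if age < 40:
        return "Adult"
    return "Senior"


# Print the category at and around each boundary value.
for age in (19, 20, 29, 30, 39, 40):
    print(age, "->", age_category(age))
```

Comparing a few boundary rows from the spreadsheet against this output is one
way to confirm that the Excel formula uses < and >= in the right places.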

Part 2: Data Cleaning and Transformation in Python (pandas/numpy)


1. Load Data:
• Load the dataset using pandas.
2. Remove Duplicates:
• Remove any duplicate rows.
3. Handle Missing Values:
• Identify and fill missing values in the Age column with the mean age.
4. Create New Column:
• Create a new column Age Category with the same categories as in
the Excel task.
5. Filter Data:
• Filter the data to include only rows where Medal is not 0.
6. Sort Data:
• Sort the data by Year in ascending order and by Name in alphabetical
order.
7. Group and Aggregate:
• Group the data by Team and Year and find the total number of medals
won by each team in each year.
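
The seven steps above could be sketched as follows. This is one possible
approach under stated assumptions, not a reference solution: the column
names Name, Age, Team, Year, and Medal are taken from the task wording, the
sample rows are invented so the block runs without the real file, and "total
number of medals" is read as the count of rows with a non-zero Medal value.
The helper name clean_olympics is hypothetical.

```python
import io

import numpy as np
import pandas as pd

# Invented stand-in for olympics.csv; the real file may have more columns.
SAMPLE_CSV = """Name,Age,Team,Year,Medal
Ana,19,Brazil,2012,1
Ana,19,Brazil,2012,1
Bo,,Norway,2008,0
Cy,31,Norway,2012,2
Di,45,Brazil,2008,3
Ed,25,Norway,2012,1
"""


def clean_olympics(df: pd.DataFrame) -> pd.DataFrame:
    """Apply steps 2-6: de-duplicate, impute Age, categorize, filter, sort."""
    df = df.drop_duplicates().copy()
    # The mean is taken after de-duplication so repeated rows do not skew it.
    df["Age"] = df["Age"].fillna(df["Age"].mean())
    # right=False makes each bin closed on the left: [20, 30), [30, 40), ...
    df["Age Category"] = pd.cut(
        df["Age"],
        bins=[-np.inf, 20, 30, 40, np.inf],
        labels=["Teen", "Young Adult", "Adult", "Senior"],
        right=False,
    )
    df = df[df["Medal"] != 0]
    return df.sort_values(["Year", "Name"]).reset_index(drop=True)


# Step 1: with the real file this would be pd.read_csv("olympics.csv").
raw = pd.read_csv(io.StringIO(SAMPLE_CSV))
cleaned = clean_olympics(raw)

# Step 7: count medal-winning rows for each team in each year.
medals_by_team_year = (
    cleaned.groupby(["Team", "Year"]).size().reset_index(name="Medals")
)

print(cleaned)
print(medals_by_team_year)
```

If the Medal column in the real file holds text such as medal names or empty
cells rather than 0, the filter in step 5 would need to be adapted.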

Part 3: Analysis and Visualization


1. Medal Distribution by Country:
• Create a bar chart showing the total number of medals won by each
country.
2. Age Distribution:
• Create a histogram of the age distribution of athletes.
3. Gender Distribution:
• Create a pie chart showing the proportion of male and female
athletes.
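
The three charts above might be drawn with matplotlib roughly as shown here.
This is a sketch under assumptions: the Team column stands in for country,
the gender column is assumed to be named Sex, the sample rows are invented,
and plot_overview is a hypothetical helper name. Check the real column names
in olympics.csv before reusing any of it.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; remove to view plots interactively
import matplotlib.pyplot as plt
import pandas as pd

# Invented sample rows; "Team" and "Sex" are assumed column names.
athletes = pd.DataFrame(
    {
        "Team": ["Brazil", "Norway", "Norway", "Brazil", "Norway"],
        "Age": [19.0, 31.0, 25.0, 45.0, 22.0],
        "Sex": ["F", "M", "F", "M", "F"],
        "Medal": [1, 2, 1, 3, 0],
    }
)


def plot_overview(df: pd.DataFrame):
    """Draw the bar chart, histogram, and pie chart side by side."""
    fig, (ax_bar, ax_hist, ax_pie) = plt.subplots(1, 3, figsize=(15, 4))

    # 1. Medals per country: count rows whose Medal value is non-zero.
    medal_counts = df[df["Medal"] != 0].groupby("Team").size()
    ax_bar.bar(list(medal_counts.index), medal_counts.to_numpy())
    ax_bar.set_title("Medals by country")
    ax_bar.set_ylabel("Medals")

    # 2. Age distribution of all athletes.
    ax_hist.hist(df["Age"], bins=10)
    ax_hist.set_title("Age distribution")
    ax_hist.set_xlabel("Age")

    # 3. Share of rows per gender value.
    sex_counts = df["Sex"].value_counts()
    ax_pie.pie(sex_counts.to_numpy(), labels=list(sex_counts.index), autopct="%1.1f%%")
    ax_pie.set_title("Gender distribution")

    fig.tight_layout()
    return fig, medal_counts, sex_counts


fig, medal_counts, sex_counts = plot_overview(athletes)
```

Calling fig.savefig with a file name of your choice writes the figure to
disk for submission. If an athlete appears in several rows, the pie chart
counts rows rather than people, so de-duplicating by athlete first may give
a more faithful proportion.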

Submission
• Save your Excel file with the cleaned and transformed data.
• Submit your Python code and output visualizations.
