Python Final Project
Project Overview:
In this project, you will select a dataset, perform basic analysis using Pandas, and create
visualizations using Matplotlib. The objective is to apply Python programming skills,
including functional programming / OOP (Object-Oriented Programming) principles, to
extract meaningful insights/trends from the chosen dataset and communicate findings
effectively.
Project Objectives:
Project Instructions:
1. Dataset Selection:
o Choose a simple dataset that interests your team (e.g., sports, weather,
environment, business, or healthcare…).
o Download the dataset in CSV format, for example from one of the following
websites or from any other website of your choice:
Kaggle
Google Dataset Search
o Obtain instructor approval before proceeding.
2. Basic Data Analysis Tasks using Pandas:
o Inspect the dataset: Display a description of the dataset, display the first and
last 10 rows of the dataset, and list all column names along with their data
types. Additionally, identify and report the total number of missing values in
each column.
o Summarize data: Calculate basic statistics for numerical columns (e.g.,
average, maximum, minimum…).
o Handle missing data: Explain and apply strategies such as dropping rows
with missing values or replacing the missing values.
o Group and Filter data: Perform simple grouping or filtering to focus on
specific subsets of the dataset (e.g., filter students who scored above 15).
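The four Pandas tasks above can be sketched roughly as follows. This is a minimal illustration, not a required solution: the CSV text and the column names `student`, `group`, and `score` are made up for the example, and `io.StringIO` stands in for the path of a real downloaded file.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a downloaded dataset file.
# With a real file, pass its path to pd.read_csv instead.
CSV_TEXT = """student,group,score
Amal,A,17.5
Bilal,A,12.0
Chen,B,
Dina,B,16.0
Emre,A,9.5
Fatou,B,18.0
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# Inspect the dataset: description, first/last rows, dtypes, missing values.
df.info()
print(df.head(10))
print(df.tail(10))
print(df.dtypes)
missing_per_column = df.isna().sum()
print(missing_per_column)

# Summarize data: basic statistics for the numerical columns.
print(df.describe())
mean_score = df["score"].mean()

# Handle missing data: two common strategies, shown side by side.
dropped = df.dropna(subset=["score"])          # remove rows with no score
filled = df.fillna({"score": mean_score})      # replace missing scores with the mean

# Group and filter data.
high_scorers = dropped[dropped["score"] > 15]  # students who scored above 15
mean_by_group = dropped.groupby("group")["score"].mean()
print(high_scorers)
print(mean_by_group)
```

Whether dropping or filling is the better choice depends on the dataset, so the report should explain the reason for the strategy chosen.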
3. Visualization Tasks:
o Create at least 3 visualizations using Matplotlib, for example:
A bar chart to compare categories.
A line plot to show trends.
A scatter plot to show correlations.
o Add basic customizations for the plots like titles, axis labels, and legends.
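As one possible starting point, the sketch below draws a bar chart, a line plot, and a scatter plot, each with a title, axis labels, and a legend. All values and variable names are invented for illustration; in the project they would come from your own DataFrame.

```python
import matplotlib.pyplot as plt

# Hypothetical example values; replace with columns from your dataset.
groups = ["A", "B", "C"]
mean_scores = [13.0, 17.0, 15.5]
weeks = [1, 2, 3, 4, 5]
weekly_average = [11.0, 12.5, 13.0, 14.5, 15.0]
hours_studied = [2, 4, 5, 7, 8, 10]
exam_scores = [9.5, 12.0, 13.5, 16.0, 17.5, 18.0]

# One figure with three side-by-side axes.
fig, (ax_bar, ax_line, ax_scatter) = plt.subplots(1, 3, figsize=(15, 4))

# Bar chart: compare categories.
ax_bar.bar(groups, mean_scores, color="steelblue", label="Mean score")
ax_bar.set_title("Mean score by group")
ax_bar.set_xlabel("Group")
ax_bar.set_ylabel("Score")
ax_bar.legend()

# Line plot: show a trend over time.
ax_line.plot(weeks, weekly_average, marker="o", label="Class average")
ax_line.set_title("Class average per week")
ax_line.set_xlabel("Week")
ax_line.set_ylabel("Score")
ax_line.legend()

# Scatter plot: show the relationship between two numerical variables.
ax_scatter.scatter(hours_studied, exam_scores, color="darkorange", label="Students")
ax_scatter.set_title("Exam score vs. hours studied")
ax_scatter.set_xlabel("Hours studied")
ax_scatter.set_ylabel("Exam score")
ax_scatter.legend()

fig.tight_layout()
```

Call `plt.show()` to view the figure interactively, or `fig.savefig(...)` with a file name of your choice to export it for the report.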
4. Optional OOP Implementation:
o Create a Python class with these simple methods:
load_data() for importing the dataset.
analyze_data() for generating basic statistics.
plot_data() for creating visualizations.
o This step is optional; you can also use functions instead of OOP.
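A class with the three methods named above might look like the sketch below. Only the method names `load_data()`, `analyze_data()`, and `plot_data()` come from the instructions. The class name `DatasetAnalyzer`, the `source` argument, the parameters of `plot_data()`, and the sample data are assumptions made for the example.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd


class DatasetAnalyzer:
    """Hypothetical wrapper around a CSV dataset (names are illustrative)."""

    def __init__(self, source):
        # `source` can be a file path or a file-like object.
        self.source = source
        self.df = None

    def load_data(self):
        """Import the dataset into a DataFrame."""
        self.df = pd.read_csv(self.source)
        return self.df

    def analyze_data(self):
        """Return basic statistics for the numerical columns."""
        if self.df is None:
            raise RuntimeError("Call load_data() before analyze_data().")
        return self.df.describe()

    def plot_data(self, category_column, value_column):
        """Draw a bar chart of the mean of value_column per category."""
        if self.df is None:
            raise RuntimeError("Call load_data() before plot_data().")
        means = self.df.groupby(category_column)[value_column].mean()
        fig, ax = plt.subplots()
        ax.bar(list(means.index.astype(str)), list(means.values),
               label=f"Mean {value_column}")
        ax.set_title(f"Mean {value_column} by {category_column}")
        ax.set_xlabel(category_column)
        ax.set_ylabel(value_column)
        ax.legend()
        return fig, ax


# Usage with made-up data; pass the path of your CSV file instead.
sample_csv = io.StringIO(
    "city,temperature\nRabat,21.0\nRabat,23.0\nOslo,5.0\nOslo,7.0\n"
)
analyzer = DatasetAnalyzer(sample_csv)
data = analyzer.load_data()
stats = analyzer.analyze_data()
fig, ax = analyzer.plot_data("city", "temperature")
print(stats)
```

The same three steps can be written as plain functions that take and return a DataFrame if your team prefers a functional style.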
5. Teamwork and Project Management:
o Use Trello to manage tasks within your team (e.g., assign roles, set deadlines,
and track progress). Try to use labels and checklists to organize your work
better.
o Submit screenshots of the Trello board as part of the final report.
Deliverables:
1. Python Code:
o Submit well-structured Python scripts with clear comments.
o Include any additional relevant files used (e.g., CSV data files).
2. Report:
o Title Page: Include project title, team members, Professor and Module Name,
and submission date.
o Introduction: Explain the dataset and objectives.
o Methods: Describe data analysis and visualization steps.
o Results: Summarize findings and include visualizations.
o Challenges Faced: Describe any challenges encountered during the project
and how you overcame them.
o Conclusion: Highlight key findings and lessons learned.
o Submit your report as a PDF file.
o Your report will probably be about 5-7 pages of text, but there are no fixed
upper or lower bounds on its size. Write at a length that covers all important
information.
3. Presentation Slides:
o Include slides for objectives, methodology, analysis results, visualizations,
and conclusions.
o You may use up to 10 slides and you are allowed 5-8 minutes to present your
work.
o Ensure that each team member contributes to the presentation.
Note: One group member should submit all the deliverables as a compressed file
(.rar or .zip) using the Classroom account.
Assessment Criteria:
Timeline:
Additional Notes:
Use datasets that are manageable in size (fewer than 10,000 rows is ideal) with a
simple structure.
Reach out for guidance at any stage of the project.