Ip Project Ansh
2024 - 25
PROJECT ON -
DATA ANALYSIS AND DATA VISUALISATION ON
ALCOHOL CONSUMPTION
Guided By - Ms. Subhi Sharma, PGT (INFORMATICS PRACTICES)
Submitted By - Ansh Ochani, CLASS XII
Roll no. -
CERTIFICATE
This is to certify that Ansh Ochani of Class 12 has
successfully completed the project work on "Data
analysis and data visualization on Alcohol
Consumption" for subject Informatics Practices
(065) of class
I would also like to thank Alpine Academy for providing me with the
necessary resources and a conducive environment to carry out this
work.
Vinit Verma
CONTENTS
a) Introduction
b) Purpose of the project
c) Scope of the project
d) Introduction to technology
e) Hardware and software requirements
f) Code
g) Output
h) Advantages of the project
i) Limitations of the project
j) Conclusion
k) Bibliography
INTRODUCTION
_____________________________________________________
Alcohol Consumption
Who consumes the most alcohol? How has
consumption changed over time? And what are
the health impacts?
Purpose of the Project
The purpose of this project is to analyze and visualize data related to
alcohol consumption to gain a deeper understanding of trends,
patterns, and factors influencing drinking habits across different
demographics. By leveraging data analysis techniques, this project aims
to provide meaningful insights into the correlation between alcohol
consumption and various socio-economic, cultural, and geographical
factors.
Scope of the Project
INTRODUCTION TO TECHNOLOGY
-PYTHON:
➤ It is used for :-
web development (server-side),
software development,
mathematics,
system scripting.
➤ Why Python :-
Python works on different platforms (Windows, Mac, Linux, Raspberry Pi,
etc.).
MATPLOTLIB
NUMPY
NumPy is a Python library used for working with arrays. NumPy was
created in 2005 by Travis Oliphant. It is an open-source project and you
can use it freely. NumPy stands for Numerical Python. In Python, lists
serve the purpose of arrays, but they are slow to process. NumPy aims to
provide an array object that is up to 50x faster than traditional Python
lists. The array object in NumPy is called ndarray, and it comes with
many supporting functions that make working with it very easy.
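The difference between a list and a NumPy array can be seen in a small sketch. The values below are made up for illustration and are not taken from the project dataset; the point is only that arithmetic on an array is written once for all elements, while a list needs a loop.

```python
import numpy as np

# Hypothetical sample values (litres per capita), not from the project dataset.
litres = [1.5, 4.0, 7.5, 2.0]

# With a plain list, element-wise arithmetic needs an explicit loop.
doubled_list = [value * 2 for value in litres]

# With an ndarray, the same operation is one vectorised expression.
arr = np.array(litres)
doubled_arr = arr * 2

print(type(arr).__name__)    # name of the array type
print(doubled_arr.tolist())  # same values as doubled_list
print(arr.mean())            # built-in helper for the average
```

Both approaches give the same numbers; the array version is shorter and is the form that pandas builds on internally.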
Hardware and software requirements
• Hardware required:
Model: Dell Inspiron 3521
Category: Laptop computer
Processor: Intel(R) Core(TM) i3-3217U CPU @ 1.80GHz 1.80 GHz
System type: 64-bit operating system, x64-based processor
Installed memory (RAM): 8.00 GB (7.89 GB usable)
• Software required
Windows 10 Pro
• Word
➤ Python
➤ Libraries of python:
✓ Matplotlib
✓ NumPy
✓ Pandas
CODE
import pandas as pd
import matplotlib.pyplot as plt

# Welcome banner
print()
print()
print("##################################################### WELCOME #########################################################")
print("")
print("")
print("-----------------------------------------------------------------------------------------------------------------------")
print("")
print(" .. TOPIC : ALCOHOL CONSUMPTION ..")
print("")
print("-----------------------------------------------------------------------------------------------------------------------")
print("")
print("")
print(" SCHOOL : ALPINE ACADEMY")
print(" NAME : ANSH OCHANI AND VINIT VERMA")
print(" CLASS : XII 'PCM'")
print(" ROLL NO : ")
print(" SUBJECT CODE : ")
print(" ")
print(" AIM : Aim of the project is to take data stored in csv or database file and analyze using python libraries \n and generate appropriate charts to visualise ")
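print("")
print("")
print("")
print("-----------------------------------------------------------------------------------------------------------------------")

# Load the dataset and replace missing values with 0
data = pd.read_csv("alcohol_consumption_2000_2020.csv")
data.fillna(0, inplace=True)

# Assumption: year_columns is not defined in the original listing. Here it is
# taken to be every column whose name consists only of digits (e.g. "2000").
year_columns = [col for col in data.columns if str(col).isdigit()]

# Average consumption of each country across all years
data["Average_Consumption"] = data[year_columns].mean(axis=1)

def categorize(consumption):
    # Below 2 is Low, 2 up to (but not including) 6 is Moderate, 6 and above is High
    if consumption < 2:
        return "Low"
    elif consumption < 6:
        return "Moderate"
    else:
        return "High"

data["Consumption_Category"] = data["Average_Consumption"].apply(categorize)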
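The categorisation step can be checked on a small hand-made table. The sketch below uses made-up rows (not real consumption figures) chosen to sit on the boundaries 2 and 6, and compares the `apply` approach with `pd.cut`, which is an alternative way to bin values and is not claimed to be what the project used.

```python
import pandas as pd

# Hypothetical rows for illustration only; not real consumption figures.
sample = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "Average_Consumption": [0.5, 2.0, 5.9, 6.0],
})

def categorize(consumption):
    """Same thresholds as the project code: <2 Low, 2 to <6 Moderate, else High."""
    if consumption < 2:
        return "Low"
    elif consumption < 6:
        return "Moderate"
    return "High"

sample["Consumption_Category"] = sample["Average_Consumption"].apply(categorize)

# pd.cut with right=False uses half-open bins [low, high),
# which matches the comparisons in categorize().
sample["Category_cut"] = pd.cut(
    sample["Average_Consumption"],
    bins=[float("-inf"), 2, 6, float("inf")],
    labels=["Low", "Moderate", "High"],
    right=False,
)

print(sample)
```

Both columns should agree row by row; `pd.cut` avoids calling a Python function once per row, which can matter on larger tables.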
# Group by Consumption Category for a summary
grouped_summary = data.groupby("Consumption_Category")["Average_Consumption"].count()
print(grouped_summary)
#Group by Year
grouped_by_year = data[year_columns].mean()
print("\nAverage Alcohol Consumption per Year:")
print(grouped_by_year)
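# User Input Analysis
# Assumption: the input prompts, the "Country" column name and the filter
# below are not shown in the original listing and are reconstructed here.
country_name = input("Enter country name: ")
year = input("Enter year: ")
country_data = data[data["Country"] == country_name]

# Check if the country exists in the dataset and the year is valid
if not country_data.empty and year in year_columns:
    consumption_value = country_data[year].values[0]
    print(f"Alcohol consumption in {country_name} in {year}: {consumption_value}")
    # Line graph of the selected country's consumption over the years
    country_yearly = country_data[year_columns].values.flatten()
    plt.figure(figsize=(10, 6))
    plt.plot(year_columns, country_yearly, marker="o", color="blue", label=country_name)
    plt.legend()
    plt.show()
else:
    print("Invalid country name or year. Please try again.")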
plt.figure(figsize=(10, 6))
grouped_by_year.plot(marker="o", color="orange")
plt.xlabel("Year")
plt.ylabel("Consumption (Liters per Capita)")
plt.grid(True)
plt.show()
# Save the categorized dataset to a new CSV file
output_file = "categorized_alcohol_consumption.csv"
data.to_csv(output_file, index=False)
print(f"\nCategorized dataset saved to: {output_file}")
# 3. Countries with the highest and lowest total alcohol consumption
# Sum the total consumption across all years for each country
data['Total_Consumption'] = data[year_columns].sum(axis=1)
max_total_consumption_country = data.loc[data['Total_Consumption'].idxmax()]
min_total_consumption_country = data.loc[data['Total_Consumption'].idxmin()]
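How `idxmax()` and `idxmin()` pick out a row can be seen on a tiny table. The sketch below uses made-up numbers, and the column names "Country", "2000" and "2001" are assumptions for illustration rather than the confirmed layout of the project CSV.

```python
import pandas as pd

# Hypothetical table; column names and values are assumptions for illustration.
sample = pd.DataFrame({
    "Country": ["A", "B", "C"],
    "2000": [1.0, 5.0, 3.0],
    "2001": [2.0, 6.0, 2.5],
})
year_columns = ["2000", "2001"]

# Row-wise total across the year columns
sample["Total_Consumption"] = sample[year_columns].sum(axis=1)

# idxmax/idxmin return the row label of the largest/smallest value;
# loc then fetches that whole row as a Series.
highest = sample.loc[sample["Total_Consumption"].idxmax()]
lowest = sample.loc[sample["Total_Consumption"].idxmin()]

print("Highest:", highest["Country"], highest["Total_Consumption"])
print("Lowest:", lowest["Country"], lowest["Total_Consumption"])
```

Because each result is a full row, the country name and the total can both be read from it without a second lookup.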
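# 4. Bar chart comparing the countries with the highest average alcohol consumption
# Assumption: grouped_by_country is not defined in the original listing; it is
# reconstructed here as the per-country mean, using an assumed "Country" column.
grouped_by_country = data.groupby("Country")[["Average_Consumption"]].mean()
top_10_countries = grouped_by_country.sort_values(by="Average_Consumption", ascending=False).head(10)
plt.figure(figsize=(10, 6))
top_10_countries["Average_Consumption"].plot(kind="bar", color="skyblue")
plt.title("Top 10 Countries with Highest Average Alcohol Consumption (2000-2020)")
plt.xlabel("Country")
plt.ylabel("Average Consumption (Liters per Capita)")
plt.xticks(rotation=45, ha="right")
plt.grid(True)
plt.show()

# Assumption: global_growth is not defined in the original listing; it is
# reconstructed here as the year-on-year percentage change of the yearly average.
global_growth = grouped_by_year.pct_change() * 100
plt.figure(figsize=(10, 6))
global_growth.plot(marker="o", color="green")
plt.title("Growth Rate in Alcohol Consumption (2000-2020)")
plt.xlabel("Year")
plt.ylabel("Growth Rate (%)")
plt.grid(True)
plt.show()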
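A year-on-year growth rate such as the one plotted above could be computed with the pandas `pct_change()` method. The sketch below uses made-up yearly averages and is only one plausible approach, not necessarily the calculation used in the project.

```python
import pandas as pd

# Hypothetical yearly averages (litres per capita), for illustration only.
yearly_average = pd.Series([4.0, 5.0, 4.5], index=["2000", "2001", "2002"])

# pct_change gives the fractional change from the previous entry;
# multiplying by 100 converts it to a percentage.
growth_rate = yearly_average.pct_change() * 100

# The first year has no previous value, so its growth rate is NaN and is dropped.
growth_rate = growth_rate.dropna()

print(growth_rate.round(2))
```

With these numbers the change from 4.0 to 5.0 is a rise of about 25%, and the change from 5.0 to 4.5 is a fall of about 10%.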
OUTPUT
INTRODUCTION :-
Summary Of Consumption :-
Average Consumption by Country (Top 20)
Average Consumption by a particular country in a particular
year (e.g. India, 2006)
Graph of Alcohol Consumption By A Particular
Country (e.g. India)
Graph of Top 10 Countries with Highest Average
Consumption
Advantages of the Project
This project provides meaningful insights into alcohol consumption
patterns and the factors influencing drinking behavior across
various demographics. By leveraging data analysis and visualization
techniques, it simplifies complex datasets into easily
understandable visuals, enabling a wider audience, including
policymakers, researchers, and the general public, to engage with
the findings. These insights can inform targeted interventions and
public health campaigns aimed at reducing alcohol-related harm
and promoting responsible consumption.
Limitations of the Project
Conclusion
This project successfully analyzes and visualizes alcohol
consumption data, providing valuable insights into trends
and factors influencing drinking behaviors across various
demographics. By leveraging data analysis techniques and
creating clear visual representations, the project offers
actionable insights for policymakers, public health
organizations, and researchers.
BIBLIOGRAPHY
https://ourworldindata.org
https://www.google.com/
https://www.who.int/