DSBDA REPORT
ON
"Weather Data Analysis"

SUBMITTED TO THE SAVITRIBAI PHULE PUNE UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE AWARD OF THE
BACHELOR OF IT ENGINEERING

BY
Mr. Hitesh Gopal Patil (11902708552)
Mr. Prathamesh Bajirao Chormale (11902708510)
Ms. Tanuja Babasaheb Shinde (11902708569)
Mr. Devidas Kakasaheb Tambe (11902708573)

UNDER THE GUIDANCE OF
Ms. Shital S. Patil

DEPARTMENT OF IT ENGINEERING
Sir Visvesvaraya Institute Of Technology, Nashik
A/p. Chincholi, Tal. Sinnar, Dist. Nashik - 422102 (MS) India
YEAR 2024-2025

DEPARTMENT OF INFORMATION TECHNOLOGY
Sir Visvesvaraya Institute Of Technology, Nashik
A/p. Chincholi, Tal. Sinnar, Dist. Nashik - 422102 (MS) India
Year 2024-25

CERTIFICATE

This is to certify that the DSBDAL report entitled "Weather Data Analysis" is submitted in partial fulfillment of the curriculum of the T.E. of IT Engineering

BY
Mr. Hitesh Gopal Patil (11902708552)
Mr. Prathamesh Bajirao Chormale (11902708510)
Ms. Tanuja Babasaheb Shinde (11902708569)
Mr. Devidas Kakasaheb Tambe (11902708573)

(Ms. Shital S. Patil)          (Dr. Pratibha V. Kashid)
Guide                          Head Of Department
SVIT, Nashik

Certificate By Guide

This is to certify that
Mr. Hitesh Gopal Patil (11902708552)
Mr. Prathamesh Bajirao Chormale (11902708510)
Ms. Tanuja Babasaheb Shinde (11902708569)
Mr. Devidas Kakasaheb Tambe (11902708573)
have completed the DSBDA project under my guidance and that I have verified the work for its originality in documentation, problem statement, literature survey and conclusion presented in the DSBDA project.

Place: Nashik                  (Ms. Shital S. Patil)
Date:

Acknowledgement

It is our immense pleasure to work on this project, Weather Data Analysis. It is only the blessing of our divine master which has prompted and mentally equipped us to undertake the study of this project. We would like to thank Prof. Dr. G. B. Shinde, Principal, Sir Visvesvaraya Institute of Technology, for giving us the opportunity to develop practical knowledge of the subject.

We are also thankful to Dr. Pratibha V. Kashid, Head of the IT Engineering Department, for valuable encouragement at every phase of our project and its completion. We offer our sincere thanks to our guide, Ms. Shital S. Patil, who encouraged us to work on the subject and gave her valuable guidance from time to time. While preparing this project we are very much thankful to her. We are also grateful to the entire staff of the IT Engineering Department for their kind co-operation, which helped us complete the project successfully.

SVIT, NASHIK.
Mr. Hitesh Gopal Patil (11902708552)
Mr. Prathamesh Bajirao Chormale (11902708510)
Ms. Tanuja Babasaheb Shinde (11902708569)
Mr. Devidas Kakasaheb Tambe (11902708573)

INDEX

SR. NO.   TITLE
1         Abstract
2         Introduction
3         Implementation
4         Conclusion

ABSTRACT

The aim of this project is to perform exploratory data analysis and predictive modeling on a weather dataset using Python. The dataset contains hourly weather records for the year 2012, including attributes such as temperature, humidity, wind speed, visibility, and atmospheric pressure. Through data preprocessing and visualization techniques, we uncover patterns, seasonal trends, and relationships among the variables. Additionally, a basic linear regression model is implemented to predict temperature from selected features: humidity, wind speed, and pressure. The project highlights the importance of data-driven insights in understanding weather behavior and sets the foundation for building more accurate predictive systems in the future.

This project presents an analysis of hourly weather data collected over the year 2012. The objective is to explore, understand, and predict weather patterns using data science tools and techniques. The dataset includes key weather parameters such as temperature, dew point, relative humidity, wind speed, visibility, and atmospheric pressure.
The analysis begins with data cleaning and preprocessing, followed by exploratory data analysis (EDA) using visualizations such as line graphs, scatter plots, histograms, and heatmaps. These visualizations help reveal trends such as seasonal temperature variation, the relationship between temperature and humidity, and correlations among the weather attributes.

INTRODUCTION

Weather has a significant impact on human life, affecting agriculture, transportation, health, and even the economy. With the growing availability of large weather datasets and powerful data analysis tools, it is now possible to understand and predict weather patterns using data science techniques.

This project focuses on analyzing hourly weather data collected throughout the year 2012. The dataset includes parameters such as temperature, dew point, humidity, wind speed, visibility, and atmospheric pressure. By performing exploratory data analysis (EDA), we aim to uncover meaningful patterns and relationships among these weather attributes.

In addition to EDA, we also implement a basic machine learning model to predict temperature from other environmental features. Python libraries such as Pandas, Matplotlib, Seaborn, and Scikit-learn are used to handle data processing, visualization, and modeling.

The objective of this project is not only to gain insights from real-world weather data but also to apply fundamental data science techniques that are essential for solving practical problems.

IMPLEMENTATION

The implementation of this project was carried out in Python using Jupyter Notebook. It involved multiple steps, including data loading, cleaning, analysis, visualization, and predictive modeling. Below is a detailed explanation of each phase:

1. Importing Required Libraries
We started by importing essential libraries:
- pandas and numpy for data manipulation,
- matplotlib.pyplot and seaborn for data visualization,
- scikit-learn for building the machine learning model.

2. Loading and Exploring the Dataset
The dataset Weather Data.csv was loaded using Pandas. We used functions like .info(), .head(), and .describe() to understand its structure and summary statistics.

3. Data Cleaning and Preprocessing
- Checked for missing values and found none.
- Removed any duplicate records.
- Converted the Date/Time column to datetime format (stored as Formatted Date) and set it as the index for time-series analysis.

4. Data Visualization
Various plots were created to analyze trends and relationships:
- Line Plot: To visualize temperature trends throughout the year.
- Histogram: To observe temperature distribution.
- Heatmap: To understand correlation among numerical features.
- Scatter Plot: To examine the relationship between humidity and temperature.
- Daily & Monthly Trends: Focused analysis on May 1st and monthly averages.

5. Feature Engineering
- Extracted the month from the datetime index for seasonal analysis.

6. Machine Learning Model
A Linear Regression model was implemented to predict temperature using:
- Relative Humidity
- Wind Speed
- Pressure

Steps:
- Defined input (X) and output (y) features.
- Split the dataset into training and testing sets (80% / 20%).
- Trained the model and evaluated its performance using R² score and Mean Squared Error (MSE).

Results:
- R² Score: 0.177
- Mean Squared Error: 119.12

An R² of 0.177 means the linear model explains only about 18% of the variation in temperature on the test set, and an MSE of 119.12 corresponds to a root-mean-squared error of about 10.9 °C. The model therefore captures part of the signal but could be improved with more features or more complex models.

CONCLUSION

In this project, we successfully analyzed a real-world weather dataset using Python. By applying data cleaning, preprocessing, and visualization techniques, we were able to uncover meaningful insights about temperature trends, humidity levels, seasonal patterns, and the relationships between different weather parameters.

We observed that temperature generally follows a seasonal trend and is related to factors like humidity and atmospheric pressure.
The data visualizations helped us better understand these patterns.

Furthermore, we implemented a basic linear regression model to predict temperature using humidity, wind speed, and pressure as input features. Although the model provided a basic prediction, its R² score of 0.177 indicated that more complex models or additional data would be needed to improve accuracy.

This project has strengthened our understanding of exploratory data analysis, time-series data handling, and regression modeling. It also demonstrates how data science techniques can be applied to gain valuable insights from environmental data, paving the way for more advanced forecasting systems in the future.

# Libraries for data handling and plotting
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Load the hourly weather records and display them
df = pd.read_csv("Weather Data.csv")
df

Output:
[DataFrame preview: the first five and last five hourly records, from 1/1/2012 0:00 to 12/31/2012 23:00. 8784 rows x 8 columns.]

# Column names, non-null counts and dtypes
df.info()
Output:
RangeIndex: 8784 entries, 0 to 8783
Data columns (total 8 columns):
 #   Column            Non-Null Count  Dtype
 0   Date/Time         8784 non-null   object
 1   Temp_C            8784 non-null   float64
 2   Dew Point Temp_C  8784 non-null   float64
 3   Rel Hum_%         8784 non-null   int64
 4   Wind Speed_km/h   8784 non-null   int64
 5   Visibility_km     8784 non-null   float64
 6   Press_kPa         8784 non-null   float64
 7   Weather           8784 non-null   object
dtypes: float64(4), int64(2), object(2)
memory usage: 549.1+ KB

# Count missing values per column
print(df.isnull().sum())

Output:
Date/Time           0
Temp_C              0
Dew Point Temp_C    0
Rel Hum_%           0
Wind Speed_km/h     0
Visibility_km       0
Press_kPa           0
Weather             0
dtype: int64

# Remove duplicate records, then show summary statistics
df = df.drop_duplicates()
df.describe()

Output:
            Temp_C  Dew Point Temp_C    Rel Hum_%  Wind Speed_km/h  Visibility_km    Press_kPa
count  8784.000000       8784.000000  8784.000000      8784.000000    8784.000000  8784.000000
mean      8.798144          2.555294    67.431694        14.945469      27.664447   101.051623
std      11.687883         10.883072    16.918881         8.688696      12.622688     0.844005
min     -23.300000        -28.500000    18.000000         0.000000       0.200000    97.520000
25%       0.100000         -5.900000    56.000000         9.000000      24.100000   100.560000
50%       9.300000          3.300000  [illegible]        13.000000      25.000000   101.070000
75%      18.800000         11.800000    81.000000        20.000000    [illegible]   101.590000
max      33.000000         24.400000   100.000000        83.000000      48.300000   103.650000

# Parse the timestamps and use them as the index for time-series analysis
df['Formatted Date'] = pd.to_datetime(df['Date/Time'])
df.set_index('Formatted Date', inplace=True)
df

Output:
[DataFrame preview: the same records, now indexed by Formatted Date from 2012-01-01 00:00:00 to 2012-12-31 23:00:00. 8784 rows x 8 columns.]

# Line plot of temperature over the whole year
plt.figure(figsize=(12,5))
plt.plot(df.index, df['Temp_C'])
plt.title("Temperature Over Time")
plt.xlabel("Date")
plt.ylabel("Temperature (C)")
plt.grid()
plt.show()

[Figure: Temperature Over Time. x-axis: Date; y-axis: Temperature (C).]

# Correlation heatmap of the numeric columns
numeric_df = df.select_dtypes(include=['float64', 'int64'])
plt.figure(figsize=(10,6))
sns.heatmap(numeric_df.corr(), annot=True, cmap='coolwarm')
plt.title("Correlation Heatmap")
plt.show()

[Figure: Correlation Heatmap of Temp_C, Dew Point Temp_C, Rel Hum_%, Wind Speed_km/h, Visibility_km and Press_kPa.]

# Histogram of temperature with a kernel density estimate
plt.figure(figsize=(8,5))
sns.histplot(df['Temp_C'], kde=True, color='orange')
plt.title('Temperature Distribution')
plt.xlabel('Temperature (C)')
plt.ylabel('Frequency')
plt.grid()
plt.show()

[Figure: Temperature Distribution. x-axis: Temperature (C); y-axis: Frequency.]

# Scatter plot of relative humidity against temperature
plt.figure(figsize=(8,5))
sns.scatterplot(data=df, x='Rel Hum_%', y='Temp_C')
plt.title('Humidity vs Temperature')
plt.grid()
plt.show()

[Figure: Humidity vs Temperature. x-axis: Rel Hum_%; y-axis: Temp_C.]

# Get data for the entire day: 1st May 2012
day_data = df.loc['2012-05-01']
plt.figure(figsize=(12,5))
plt.plot(day_data.index, day_data['Temp_C'], marker='o', color='green')
plt.title('Temperature Throughout the Day (1 May 2012)')
plt.xlabel('Time')
plt.ylabel('Temperature (C)')
plt.xticks(rotation=45)
plt.grid()
plt.show()

[Figure: Temperature Throughout the Day (1 May 2012). x-axis: Time; y-axis: Temperature (C).]

# Extract the month from the index and plot the monthly mean temperature
df['Month'] = df.index.month
monthly_avg = df.groupby('Month')['Temp_C'].mean()
plt.figure(figsize=(10,5))
monthly_avg.plot(marker='o', color='purple')
plt.title('Monthly Average Temperature')
plt.xlabel('Month')
plt.ylabel('Avg Temperature (C)')
plt.grid()
plt.show()

[Figure: Monthly Average Temperature. x-axis: Month; y-axis: Avg Temperature (C).]

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Input features and target
X = df[['Rel Hum_%', 'Wind Speed_km/h', 'Press_kPa']]
y = df['Temp_C']
# Split data (the rest of this call, after "random", is cut off in the source)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random...
# Train model
model = LinearRegression()
model.fit(X_train, y_train)
# Predict and evaluate
y_pred = model.predict(X_test)
y_pred

Output:
array([ 8.31174767, 10.49252381,  1.6905262 , ...,  9.3846832 , 13.71053101, 14.93376871])

print("R2 Score:", r2_score(y_test, y_pred))
print("Mean Squared Error:", mean_squared_error(y_test, y_pred))

Output:
R2 Score: 0.17748486570306532
Mean Squared Error: 119.11967208953386
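The two reported metrics can be put on a common scale with a little arithmetic. The sketch below is not part of the original notebook, and the function names are hypothetical. It assumes R² was computed the way scikit-learn's r2_score does, as 1 - SS_res / SS_tot on the test set, so that MSE / (1 - R²) is the error of a baseline that always predicts the mean test temperature.

```python
import math

# Metric values copied from the notebook output in this report.
R2_REPORTED = 0.17748486570306532
MSE_REPORTED = 119.11967208953386


def rmse_from_mse(mse: float) -> float:
    """Root-mean-squared error, in the same unit as the target (deg C here)."""
    return math.sqrt(mse)


def baseline_mse_from_r2(mse: float, r2: float) -> float:
    """MSE of a predictor that always outputs the test-set mean.

    Follows from R^2 = 1 - MSE_model / MSE_baseline, which holds when
    R^2 is defined as 1 - SS_res / SS_tot (an assumption stated above).
    """
    if r2 >= 1.0:
        raise ValueError("R^2 must be below 1 for this relation to be usable")
    return mse / (1.0 - r2)


rmse = rmse_from_mse(MSE_REPORTED)
baseline_mse = baseline_mse_from_r2(MSE_REPORTED, R2_REPORTED)
baseline_rmse = rmse_from_mse(baseline_mse)
# Fraction by which the model lowers the typical error versus the baseline.
reduction = 1.0 - rmse / baseline_rmse

print(f"Model RMSE:         {rmse:.2f} C")
print(f"Mean-only baseline: {baseline_rmse:.2f} C")
print(f"Error reduction:    {reduction:.1%}")
```

With the reported values, the model's typical error is about 10.9 °C against about 12.0 °C for the mean-only baseline, a reduction of roughly 9%. This is another way of reading the low R² score: the three chosen features improve only slightly on predicting the average temperature.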