Ml Final Report

The document is a project report on 'Calories Burnt Prediction' developed by students under the guidance of Dr. Janani M at SRM Institute of Science and Technology. It discusses the significance of accurately predicting caloric expenditure for fitness and health management, employing advanced machine learning models to analyze various physiological and contextual factors. The report outlines methodologies, literature surveys, and the development of an optimized prediction model, addressing challenges such as personalization and data privacy.
CALORIES BURNT PREDICTION

A PROJECT REPORT

21CSC305P – MACHINE LEARNING


(2021 Regulation)
III Year/ V Semester
Academic Year: 2024-2025

Submitted by
MUTHUMANI J D [RA22110030112002]
LOKESHWARAN R [RA22110030112018]
GOKUL S [RA2211003011996]
VIKKESH P [RA2211003012005]

Under the Guidance of

Dr. Janani M
Assistant Professor
Department of Computing Technologies
In partial fulfillment of the requirements for the degree of

BACHELOR OF TECHNOLOGY
in
COMPUTER SCIENCE ENGINEERING

SCHOOL OF COMPUTING
COLLEGE OF ENGINEERING AND TECHNOLOGY
SRM INSTITUTE OF SCIENCE AND TECHNOLOGY
KATTANKULATHUR- 603 203
NOVEMBER 2024

SRM INSTITUTE OF SCIENCE AND TECHNOLOGY

KATTANKULATHUR – 603 203


BONAFIDE CERTIFICATE

Certified that 21CSC305P – MACHINE LEARNING report titled “CALORIES BURNT
PREDICTION” is the bonafide work of Muthumani J D [RA2211003012002],
Lokeshwaran [RA2211003012018], Gokul S [RA221100301160], Vikkesh P
[RA2211003012005] who carried out the project work under my supervision.
Certified further, that to the best of my knowledge the work reported herein
does not form part of any other project report or dissertation.

SIGNATURE
Janani M
Course Faculty
Assistant Professor
Department of Computing Technologies
SRM Institute of Science and Technology
Kattankulathur

SIGNATURE
Dr. Niranjana. G
Head of the Department
Professor
Department of Computing Technologies
SRM Institute of Science and Technology
Kattankulathur

ABSTRACT
The prediction of calories burnt has gained significant interest, particularly in
fitness, health monitoring, and medical fields. Accurately estimating calories
burnt helps individuals manage weight, optimize exercise routines, and maintain
overall health. Caloric expenditure prediction involves analyzing several
variables such as heart rate, activity type, duration, intensity, age, weight, and
gender. Advanced machine learning models, including regression techniques,
decision trees, and neural networks, offer promising accuracy in predicting
calories burnt by capturing complex relationships between these factors.
Wearable devices and fitness apps frequently incorporate these models,
providing real-time feedback and personalized insights to users. Additionally,
with the increasing availability of wearable sensor data, deep learning models
have emerged as powerful tools to further enhance prediction accuracy. Despite
advancements, challenges remain, such as handling personalized predictions,
accounting for metabolic variations, and ensuring data privacy. This study
focuses on developing an optimized calorie prediction model that leverages
physiological and contextual data to provide reliable calorie burn estimates,
catering to a wide range of user demographics and physical activity levels.

TABLE OF CONTENTS

ABSTRACT
LIST OF FIGURES
LIST OF TABLES
ABBREVIATIONS
1 INTRODUCTION
1.1 Accurate Caloric Expenditure Estimation
1.2 Machine Learning in Calorie Prediction
1.3 Software Requirements Specification
2 LITERATURE SURVEY
2.1 Models for Caloric Expenditure Estimation
2.2 Comparative Study of Calories Burnt Prediction
3 METHODOLOGY
3.1 Data Preprocessing
3.1.1 Performing EDA
3.1.2 Feature Extraction
3.1.3 XGB_Regressor
4 RESULTS AND DISCUSSIONS
4.1 Model Performance Comparison
4.2 Comparative Analysis
4.3 Deployment Code
5 CONCLUSION AND FUTURE ENHANCEMENT
REFERENCES

LIST OF FIGURES

Fig. No.  Figure Name
3.1   Sample Dataset
3.2   Data Information
3.3   Age Vs Count
3.4   Height Vs Count
3.5   Weight Vs Count
3.6   Duration Vs Count
3.7   Heart_Rate Vs Count
3.8   Model Cross Validation Score
3.9   Correlation Matrix
3.10  Linear Regression
3.11  XGB regressor
5.1   Deployment 1
5.2   Deployment 2
5.3   Deployment 3

LIST OF TABLES

Table No.  Title of Table
2.2  Comparative Study of Calories Burnt Prediction
4.1  Model Comparison

ABBREVIATIONS

BMI - Body Mass Index


RMSE - Root Mean Squared Error

R² - R-Squared (Coefficient of Determination)

SVR - Support Vector Regression

LR - Linear Regression

RR - Ridge Regression

RF - Random Forest

ML - Machine Learning

API - Application Programming Interface


Flask - (No abbreviation; Flask is a web framework but often referred to by its
name alone)

CV - Cross-Validation

DB - Database

EDA - Exploratory Data Analysis

StdScaler - Standard Scaler (used for data normalization)

CHAPTER 1

INTRODUCTION
Caloric expenditure prediction is a crucial component in the domains of
fitness, health monitoring, and lifestyle management, as it aids individuals and
healthcare providers in assessing energy output related to various activities.
This capability enables targeted improvements in exercise routines, weight
management plans, and even treatment protocols for metabolic conditions.
Estimating calories burnt requires a comprehensive understanding of
physiological and contextual factors, including an individual's basal metabolic
rate (BMR), activity intensity, heart rate, age, weight, gender, and even
environmental conditions. With the evolution of wearable technologies and the
proliferation of fitness tracking devices, there is a growing wealth of sensor
data available for analysis. This data, when combined with machine learning
algorithms, allows for sophisticated calorie burn prediction models that can
provide near real-time estimates of energy expenditure with significant
accuracy. Machine learning approaches, particularly with access to extensive datasets,
are increasingly applied to enhance prediction precision by automatically
recognizing patterns and dependencies in the data.

1.1 Accurate Caloric Expenditure Estimation
Accurate caloric expenditure estimation is essential for individuals aiming to
achieve and maintain a healthy lifestyle. Caloric expenditure, or the number of
calories burned, reflects the energy cost of physical activities, ranging from
simple daily movements to structured exercise routines. Understanding this
expenditure is vital for various purposes, such as weight management, athletic
performance optimization, and general health monitoring. For individuals
engaged in fitness and weight loss programs, knowing the calories burned
allows them to balance calorie intake with expenditure, leading to effective
weight management. Furthermore, accurate calorie tracking is essential for
athletes who want to optimize their energy balance to enhance performance
and recovery. Several physiological and contextual factors influence caloric
expenditure, making precise estimation a challenging task. Key determinants
include age, weight, height, gender, heart rate, and body composition, all of
which contribute to an individual's basal metabolic rate (BMR) — the energy
expended while at rest. The intensity, type, and duration of physical activity
play a significant role in determining calorie burn. Machine learning models
are particularly advantageous in refining calorie predictions by identifying
patterns across large datasets. Techniques like regression analysis, decision
trees, and neural networks have been implemented to map relationships
between inputs like heart rate, age, weight, and activity type to accurately
predict calorie burn. As wearable devices continue to evolve and machine
learning models become increasingly sophisticated, the future holds promise
for even more accurate and individualized calorie expenditure predictions,
ultimately empowering users to make informed health and lifestyle decisions.

1.2 Machine Learning in Calorie Prediction
Machine learning plays a crucial role in advancing calorie prediction accuracy
by processing complex, multifaceted data related to caloric expenditure.
Traditional methods, while useful, often fail to account for individual variability
in factors like age, weight, heart rate, activity intensity, and metabolic rate.
Machine learning models, such as regression techniques, decision trees, and
neural networks, can analyze large datasets and recognize patterns across
multiple variables simultaneously, enabling personalized and real-time calorie
predictions. By leveraging continuous data from wearable devices, machine
learning models can adapt to users' unique physiological and lifestyle factors,
providing tailored insights that are far more accurate than conventional
estimation methods. These capabilities make machine learning a powerful tool
in fitness and health management, offering users a deeper understanding of their
energy expenditure and helping them achieve their health goals effectively.
In contrast to conventional estimation methods, machine learning leverages
data-driven approaches to create highly individualized predictions by
integrating factors like age, weight, height, activity type, heart rate, and
other biometric data. This personalized and adaptive approach to calorie
prediction not only enhances accuracy but also aligns closely with individual
goals, such as weight management, athletic performance, or general health
improvement. As machine learning algorithms continue to improve, they hold the
potential to refine calorie prediction further, making it more precise,
context-aware, and ultimately beneficial for a wide range of users.

1.3 Software Requirements Specification
A Software Requirements Specification (SRS) is a comprehensive document that
outlines the functional and non-functional requirements of a software project.
It serves as a formal agreement between the parties involved in the
development.

1. Purpose and Scope: The SRS begins with a clear description of the project’s
purpose and scope, establishing the primary goals and defining the boundaries of
the system. The purpose section specifies why the software is being developed,
identifying the main objectives and the intended benefits.

2. Functional Requirements: Functional requirements describe the core
functionalities and features that the software must provide. These requirements
are typically written in terms of "what" the system should do. For instance, in
a calorie prediction application, functional requirements might include data
input features (such as entering weight, height, age), integration with wearable
devices, and real-time calorie calculation.

3. Data Requirements and Database Specifications: Data requirements describe
the type, format, and structure of data the system will store, process, or
interact with. This includes specifications for any databases, data models, and
data flows within the application. The SRS should detail the fields required for
user data, activity logs, and any historical information needed for accurate
calorie predictions.

4. Interface Requirements: Interface requirements define how the software will
interact with users, other systems, or hardware devices. This includes the user
interface (UI) design specifications, specifying elements like screen layouts,
button placements, input forms, and navigation flow. It also covers application
programming interfaces (APIs).

CHAPTER 2

LITERATURE SURVEY

The literature on caloric expenditure prediction highlights a growing interest in
leveraging advanced computational techniques to enhance the accuracy and
personalization of calorie estimations. Early studies primarily focused on
traditional methods, such as the Harris-Benedict and Mifflin-St Jeor equations,
which provided foundational models for calculating basal metabolic rates based
on age, gender, weight, and height. However, as technology evolved, research
increasingly emphasized the use of machine learning algorithms and wearable
devices to refine these predictions. Notable works in this area have explored the
integration of heart rate monitoring, activity recognition, and physiological data
to develop predictive models that adapt to individual users' metabolic responses
and activity levels. The application of machine learning (ML) in calorie
prediction has garnered significant attention in recent years, driven by the
increasing demand for personalized health and fitness solutions. Various studies
have explored the efficacy of different ML algorithms in estimating caloric
expenditure based on physiological and contextual data. For instance, research
has demonstrated that regression models, including linear regression and
support vector machines, can effectively predict calorie burn from heart rate and
activity type data. Recent studies exploring machine learning techniques,
including regression models, decision trees, and neural networks, are examined,
showcasing their potential to provide more accurate and individualized
predictions by leveraging vast datasets and real-time physiological inputs.

2.1 Models for Caloric Expenditure Estimation
The estimation of caloric expenditure, or the number of
calories burned during various activities, is critical for individuals seeking to
manage weight, optimize fitness routines, and monitor overall health.
Numerous models have been developed to predict caloric burn, each
employing different methodologies and relying on various input factors. This
section delves into several prominent models for caloric expenditure
estimation, including traditional predictive equations, heart rate monitoring
techniques, and advanced machine learning algorithms.

1. Traditional Predictive Equations

Traditional methods for estimating caloric expenditure typically utilize
predictive equations that calculate Basal Metabolic Rate (BMR) and total daily
energy expenditure (TDEE). BMR represents the energy expended while at
rest, and it serves as a foundation for estimating caloric needs.

2. Physiological Models

Physiological models of caloric expenditure estimation have emerged to
address the limitations of traditional methods by incorporating physiological
parameters that influence energy expenditure. These models often utilize heart
rate (HR) monitoring as a proxy for oxygen consumption (VO2), which is a
reliable indicator of caloric burn during physical activity. One widely
recognized model is the Heart Rate Reserve (HRR) method, which calculates
caloric expenditure by assessing the relationship between heart rate and
oxygen uptake. This approach operates on the premise that heart rate increases
with physical exertion and that higher heart rates correspond to higher caloric
burn.
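
As a rough illustration of the two model families above, the sketch below computes BMR with the Mifflin-St Jeor equation named earlier in this chapter and the fraction of heart rate reserve in use. The function names and the sample person are hypothetical, and converting a heart rate reserve fraction into calories needs a further calibration step that is not shown here.

```python
def bmr_mifflin_st_jeor(weight_kg, height_cm, age_years, sex):
    """Basal metabolic rate in kcal/day from the Mifflin-St Jeor equation."""
    base = 10.0 * weight_kg + 6.25 * height_cm - 5.0 * age_years
    if sex == "male":
        return base + 5.0
    if sex == "female":
        return base - 161.0
    raise ValueError("sex must be 'male' or 'female'")


def percent_heart_rate_reserve(hr, hr_rest, hr_max):
    """Fraction of heart rate reserve in use: (HR - HRrest) / (HRmax - HRrest)."""
    reserve = hr_max - hr_rest
    if reserve <= 0:
        raise ValueError("hr_max must exceed hr_rest")
    return (hr - hr_rest) / reserve


# Hypothetical person: 70 kg, 175 cm, 30 years old, male.
print(bmr_mifflin_st_jeor(70, 175, 30, "male"))   # 1648.75
# Hypothetical heart rates in beats per minute.
print(round(percent_heart_rate_reserve(130, 60, 190), 3))
```

The BMR value depends only on demographic inputs, which is why such equations cannot reflect the type or intensity of an activity, while the heart rate reserve fraction changes from moment to moment with exertion.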

2.2 Comparative study of Calories Burnt Prediction
| Method/Model | Description | Strengths | Weaknesses |
|---|---|---|---|
| Predictive Equations | Traditional equations that estimate BMR and TDEE based on demographic factors. | Simple to use, requires minimal data, well-established. | Limited accuracy for individuals, not tailored to activity type. |
| Heart Rate Monitoring | Estimates calories burned based on heart rate data during activities. | Real-time monitoring, correlates well with energy expenditure. | May not account for individual differences, can be influenced by factors like hydration and fatigue. |
| Regression Models | Statistical models that predict caloric burn using multiple variables (e.g., heart rate, age, weight). | Can be more accurate than predictive equations, can incorporate multiple factors. | Still may struggle with nonlinear relationships, requires data for training. |
| Decision Trees | A non-linear model that splits data into branches based on decision rules derived from input features. | Intuitive representation, handles non-linear relationships well. | Prone to overfitting, requires a robust dataset for training. |
| Neural Networks | Deep learning models that learn complex patterns in large datasets to predict calories burned. | Highly accurate, capable of handling vast amounts of data and capturing nonlinear relationships. | Requires significant computational resources, data-intensive, can be a "black box" for interpretation. |
| Wearable Technology | Devices that collect real-time biometric data (e.g., heart rate, movement) to estimate caloric burn. | Convenient, user-friendly, provides continuous monitoring. | Variability in device accuracy, potential privacy concerns. |

The comparative study of various methods for predicting calories burnt
highlights the strengths and weaknesses of each approach. The table above
summarizes key attributes of traditional predictive equations, heart rate
monitoring techniques, regression models, decision trees, neural networks, and
wearable technology.

CHAPTER 3
METHODOLOGY OF CALORIES BURNT PREDICTION
The methodology for predicting calories burnt typically
involves a multi-step approach that integrates data collection, model selection, and
validation. The first step in the process is data collection, which can be sourced
from various inputs, including user-provided demographic information (such as
age, weight, height, and gender), physiological measurements (like heart rate and
activity type), and data gathered from wearable devices that monitor real-time
activity levels. Once the data is collected, it undergoes preprocessing to clean and
normalize the dataset, ensuring consistency and accuracy for analysis. Traditional
approaches may utilize predictive equations to estimate Basal Metabolic Rate
(BMR) and Total Daily Energy Expenditure (TDEE), while advanced methods may
incorporate machine learning techniques, such as regression analysis, decision
trees, or neural networks, to capture complex relationships in the data.

3.1 DATA PREPROCESSING

Data preprocessing is a critical step in the calories burnt prediction
methodology, ensuring that the data used for model training and evaluation is
clean, consistent, and suitable for analysis. The quality of the input data
directly affects the accuracy of the prediction models, making preprocessing
essential for successful outcomes. Numerical features, including age and Body
Mass Index (BMI), undergo normalization or standardization to bring them to a
comparable scale, thereby enhancing the model’s performance. Additionally, we
conduct exploratory data analysis to identify and select the most relevant
features that significantly impact calories burnt, applying techniques such as
correlation analysis or recursive feature elimination. This comprehensive
preprocessing ensures that the dataset is well-prepared, allowing the subsequent
machine learning models to deliver accurate and reliable predictions.
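
The standardization step can be sketched in a few lines. This is a minimal standard-library illustration of z-score scaling, not the project's own preprocessing code. The `standardize` helper and the sample ages are assumptions made for the example. It uses the population standard deviation, which is also what scikit-learn's StandardScaler uses.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: (x - mean) / population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


ages = [20, 30, 40, 50, 60]  # hypothetical sample values
age_z = standardize(ages)
print([round(z, 3) for z in age_z])
```

After scaling, the column has mean 0 and standard deviation 1, so features measured in different units (years, kilograms, beats per minute) contribute on a comparable scale.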

Fig 3.1 Sample Dataset

Fig 3.2 Data information

3.1.1 PERFORMING EDA


Exploratory Data Analysis (EDA) is a crucial step in the data
analysis process that involves summarizing and visualizing the main
characteristics of a dataset before applying any predictive modeling techniques. In
the context of calories burnt prediction, EDA helps to uncover patterns, spot
anomalies, test hypotheses, and validate assumptions. Next, correlation analysis is
performed to identify relationships between different variables, helping to
understand how factors like heart rate and activity duration impact caloric
expenditure. Heatmaps can be particularly effective in visualizing these
correlations, highlighting which features may have the strongest influence on the
target variable—calories burnt.
EDA also involves assessing missing data patterns and determining if certain
demographic groups or activity types are underrepresented in the dataset.
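
Two of the checks described above, missing-data counts and group representation, can be sketched as follows. The records, the helper names, and the 20-year bin width are hypothetical. The column names follow the figures in this chapter.

```python
from collections import Counter

# Hypothetical records; None marks a missing value.
records = [
    {"Age": 23, "Weight": 70.0, "Heart_Rate": 105},
    {"Age": 35, "Weight": None, "Heart_Rate": 98},
    {"Age": 41, "Weight": 82.5, "Heart_Rate": None},
    {"Age": 67, "Weight": 64.0, "Heart_Rate": 88},
    {"Age": 29, "Weight": 58.0, "Heart_Rate": 110},
]


def missing_counts(rows):
    """Count missing (None) entries per column."""
    counts = Counter()
    for row in rows:
        for column, value in row.items():
            counts[column] += value is None
    return dict(counts)


def age_group_counts(rows, width=20):
    """Count records per age bin of the given width, e.g. '20-39'."""
    counts = Counter()
    for row in rows:
        low = (row["Age"] // width) * width
        counts[f"{low}-{low + width - 1}"] += 1
    return dict(counts)


print(missing_counts(records))
print(age_group_counts(records))
```

A bin with very few records signals a group for which the model's predictions may be less reliable.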

Age Vs Count
The analysis of "Age vs. Count" in the context of calories burnt prediction
involves examining the distribution of individuals across various age groups and
their corresponding counts within the dataset. This analysis is crucial for
understanding demographic trends and how age influences physical activity
patterns and caloric expenditure. Typically, data visualization techniques such as
bar charts or histograms are employed to illustrate the number of participants in
each age category. It is commonly recognized that metabolic rates tend to decline
with age, which may affect the total caloric expenditure during physical activities.
As a result, individuals in older age brackets might burn fewer calories for the
same level of exertion compared to younger individuals. By investigating these
age-related patterns, researchers can better tailor caloric burn prediction models to
accommodate the varying metabolic rates and activity levels across different age
groups, ultimately leading to more accurate and personalized calorie predictions.

Fig 3.3 Age Vs Count

Height Vs Count
The "Height vs. Count" analysis serves as an essential aspect of understanding the
demographics of the dataset in calories burnt prediction. By examining the
distribution of participants across various height categories, this analysis can reveal
patterns that may influence caloric expenditure. This analysis not only provides
insight into the physical diversity of the dataset but also highlights potential
relationships between height and caloric expenditure. Taller individuals typically
have a higher basal metabolic rate (BMR) due to greater body mass and surface
area, which can lead to increased caloric burn during physical activities.
Conversely, shorter individuals might exhibit lower caloric expenditure for the
same activity level. By integrating height-related insights, the prediction models
can become more precise, offering better personalized caloric expenditure
estimates tailored to individuals' unique physical characteristics.

Fig 3.4 Height Vs Count

Weight Vs Count
The "Weight vs. Count" analysis is a pivotal component in understanding
how body weight distribution impacts caloric expenditure within the dataset for
calories burnt prediction. This analysis typically employs visual representations,
such as histograms or bar charts, to illustrate the frequency of individuals across
various weight categories. Underrepresentation of certain groups, such as
individuals with lower body weight, may lead to less accurate predictions for
those demographics. In addition to frequency counts, the analysis may also
explore how weight correlates with other variables, such as height and age, to
understand the broader context of caloric expenditure.

Fig 3.5 Weight Vs Count

Duration Vs Count
The length of time you engage in physical activity directly impacts the total
calories burned. For longer sessions, the body can tap into fat stores after initial
glycogen depletion, particularly during moderate-intensity activities. Activities like
running, cycling, or swimming burn more calories over extended durations,
making duration-based workouts ideal for endurance and steady calorie burning.
On the other hand, the count or frequency of workout sessions throughout the week
can also play a critical role. Shorter, high-intensity workouts done more frequently
can add up to a high calorie burn over time. The optimal balance between duration
and count varies based on individual goals. In the dataset, workout durations are
spread unevenly, with high spikes around the 6, 12, 18, and 25 minute ranges,
making duration a good factor for predicting calories.

Fig 3.6 Duration Vs Count

Heart_Rate Vs Count
The "Heart Rate vs. Count" analysis plays a crucial role in understanding how
heart rate data is distributed among individuals and its significance in predicting
caloric expenditure. By utilizing visual tools such as histograms or line charts, this
analysis can effectively illustrate the frequency of heart rate measurements within
specific ranges during various activities. Typically, one might observe a normal
distribution where most individuals fall within a certain heart rate range, while
extremes may indicate either very low activity levels or very high-intensity efforts.
This analysis is particularly important as heart rate is a key physiological indicator.
Fig 3.7 Heart_Rate Vs Count

3.1.2 Feature Extraction

Model Cross Validation


Model cross-validation is a critical process in the development of predictive
models for calories burnt, as it ensures that the model's performance is accurately
assessed and can generalize well to unseen data. The primary goal of
cross-validation is to evaluate how the results of a statistical analysis will
generalize to an independent dataset, thereby preventing issues such as
overfitting, where a model learns the noise in the training data rather than the
underlying pattern. In calories burnt prediction, the most commonly used method of
cross-validation is k-fold cross-validation, in which the data is split into k
roughly equal folds and each fold serves once as the validation set while the
remaining folds are used for training. Cross-validation can help identify any
potential biases in the model by revealing how performance varies across
different segments of the dataset. By exposing the model's strengths and
weaknesses, cross-validation assists in refining feature selection and model
tuning, ultimately leading to better prediction accuracy.

Fig 3.8 Model Cross Validation Score
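
The fold construction behind k-fold cross-validation can be sketched as below. `k_fold_indices` is a hypothetical helper written for this illustration. It splits indices in order without shuffling, whereas in practice the rows are usually shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Distribute the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# Every sample appears in exactly one validation fold.
for fold, (train, validation) in enumerate(k_fold_indices(10, 5), start=1):
    print(fold, validation)
```

A model is trained once per fold on `train` and scored on `validation`. The spread of the k scores indicates how stable the model is across different segments of the data.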

Correlation Matrix
A correlation matrix is a powerful tool used to assess the relationships between
multiple variables in a dataset, providing insights into how features interact with
one another. In the context of calories burnt prediction, a correlation matrix
allows researchers to examine the strength and direction of relationships
between various factors, such as demographic data, physiological metrics, and
activity levels. Each cell in the matrix represents the correlation coefficient
between two variables, typically ranging from -1 to +1. A value closer to 1
indicates a strong positive correlation, meaning that as one variable increases,
the other tends to increase as well. A value closer to -1 indicates a strong
negative correlation, and a value near 0 indicates little or no linear
relationship. When two input features are strongly correlated with each other,
this can complicate the modeling process by making it difficult to ascertain the
individual contribution of the correlated features to the target variable,
calories burnt.

Fig 3.9 Correlation Matrix
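
Each cell of such a matrix is a Pearson correlation coefficient, which can be computed as in the sketch below. The helper names and the sample values are hypothetical and are not taken from the project's dataset.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return a nested dict with the correlation of every pair of columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}


# Hypothetical sample values.
data = {
    "Duration": [5, 10, 15, 20, 25],
    "Heart_Rate": [85, 92, 98, 104, 112],
    "Calories": [20, 55, 95, 140, 190],
}
matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {other: round(value, 3) for other, value in row.items()})
```

The matrix is symmetric with ones on the diagonal, so only the upper or lower triangle needs to be inspected.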


3.2 Algorithms

Linear Regression
Linear regression is a fundamental machine learning technique widely used in
the field of predictive modeling, including calories burnt prediction. This
method establishes a linear relationship between the dependent variable, which
in this case is the total calories burnt, and one or more independent variables,
such as heart rate, weight, duration of activity, and age. By fitting a linear
equation to the observed data, linear regression enables researchers to
understand how changes in predictor variables influence caloric expenditure. We
evaluated the model's performance using metrics such as Mean Squared Error
(MSE) and R-squared, which helped us assess how well the model explained the
variability in calories burnt. While linear regression serves as a useful
baseline model, we also recognized its limitations, particularly in capturing
complex non-linear relationships, which informed our exploration of more
advanced machine learning techniques in subsequent phases of the project.

Fig 3.10 Linear Regression
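
To make the fitting and evaluation steps concrete, the sketch below fits a one-feature least-squares line and reports MSE and R-squared. It is a simplified illustration with a single predictor and hypothetical values, whereas the model described above uses several predictors.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_line(slope, intercept, x):
    return [slope * v + intercept for v in x]


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Share of the variance in y_true explained by the predictions."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


duration = [5, 10, 15, 20, 25]      # hypothetical minutes
calories = [20, 55, 95, 140, 190]   # hypothetical kcal
slope, intercept = fit_line(duration, calories)
fitted = predict_line(slope, intercept, duration)
print(slope, intercept, mse(calories, fitted), r_squared(calories, fitted))
```

The residuals in this example curve upward at both ends of the range, which is the kind of non-linearity a straight line cannot capture.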

XGBRegressor

XGBRegressor is a powerful machine learning model based on the XGBoost
(Extreme Gradient Boosting) algorithm, specifically designed for regression
tasks. It has gained popularity in the data science community due to its superior
performance and speed compared to traditional regression models.
XGBRegressor incorporates advanced regularization techniques, which help
prevent overfitting, a common issue in machine learning where a model
performs well on training data but poorly on unseen data. By penalizing overly
complex models, XGBRegressor maintains generalization capability, making it
effective across diverse populations and activity levels. Its ability to
automatically handle missing data and its flexibility in feature engineering
further enhance its usability. Employing XGBRegressor for calories burnt
prediction involves tuning various hyperparameters, such as learning rate, max
depth of trees, and the number of estimators, to optimize model performance.
Evaluating the model through techniques such as cross-validation provides a
robust assessment of its predictive accuracy. Overall, XGBRegressor represents
a powerful tool in the arsenal of machine learning techniques, enabling
researchers and practitioners to develop accurate, reliable, and efficient models
for estimating caloric expenditure based on a multitude of influencing factors.

Fig 3.11 XGBRegressor
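
The role of the learning rate and the number of estimators can be illustrated with a deliberately simplified gradient boosting loop. This is not XGBoost itself: it uses depth-1 trees (stumps) on a single feature with squared-error loss, and it omits XGBoost's regularization terms, second-order updates, and missing-value handling. All names and data values are hypothetical.

```python
def fit_stump(x, residuals):
    """Find the single-threshold split that minimizes squared error."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for v, r in zip(x, residuals) if v <= threshold]
        right = [r for v, r in zip(x, residuals) if v > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("x needs at least two distinct values")
    return best[1:]


def fit_boosted_stumps(x, y, n_estimators=50, learning_rate=0.1):
    """Fit each new stump to the residuals of the current ensemble."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_estimators):
        residuals = [t - p for t, p in zip(y, predictions)]
        threshold, left_value, right_value = fit_stump(x, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if v <= threshold else right_value)
            for v, p in zip(x, predictions)
        ]
    return base, stumps, learning_rate


def predict(model, value):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(
        left if value <= threshold else right
        for threshold, left, right in stumps
    )


def training_mse(model, x, y):
    return sum((predict(model, v) - t) ** 2 for v, t in zip(x, y)) / len(y)


x = [1, 2, 3, 4, 5, 6, 7, 8]   # hypothetical feature values
y = [v * v for v in x]         # hypothetical non-linear target
for n in (5, 100):
    print(n, round(training_mse(fit_boosted_stumps(x, y, n_estimators=n), x, y), 3))
```

A smaller learning rate needs more estimators to reach the same training error, which is why these two hyperparameters are usually tuned together and checked on held-out data rather than on the training set.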

CHAPTER 4

RESULT AND DISCUSSION

The results and discussion section provides a comprehensive evaluation of the
predictive models developed for estimating calories burnt, focusing on the
performance metrics, insights gained from the analysis, and implications for
practical applications. In this study, various machine learning models, including
Linear Regression, Random Forest, and XGBRegressor, were employed to
predict caloric expenditure based on multiple features such as heart rate, weight,
age, duration of activity, and activity type. The models were evaluated using
standard metrics such as Mean Absolute Error (MAE), Root Mean Square Error
(RMSE), and R-squared values to assess their accuracy and reliability.

The findings revealed that XGBRegressor outperformed the other models, achieving
the lowest RMSE and highest R-squared value, indicating its superior ability to
capture the complexities of the dataset. This performance can be attributed to
XGBoost's ensemble learning approach, which effectively combines multiple
decision trees to reduce bias and variance. Additionally, feature importance
analysis highlighted that heart rate and weight were among the most significant
predictors of caloric expenditure, confirming existing literature that emphasizes
the role of these physiological factors in determining energy expenditure during
physical activities.

The results also indicated some interesting trends, such as
variations in caloric burn across different age groups and activity types. For
instance, younger individuals engaging in high-intensity workouts demonstrated
significantly higher caloric burn compared to older individuals participating in
moderate activities.

These findings have practical implications for health and
fitness professionals, as they can leverage these predictive models to create
personalized exercise programs and dietary plans. Future work may focus on
enhancing model accuracy by exploring additional data sources, incorporating
real-time monitoring through wearable technology, and addressing the nuances
of individual variability in metabolic rates.
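
For reference, the three evaluation metrics named in this chapter have the following standard definitions. The symbols are introduced here as assumptions: $y_i$ is the measured calories for sample $i$, $\hat{y}_i$ the model's prediction, $\bar{y}$ the mean of the measured values, and $n$ the number of test samples.

```latex
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}
\qquad
R^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}
```

MAE and RMSE are expressed in the same unit as the target (calories), with RMSE penalizing large errors more heavily. R-squared is unitless, and values closer to 1 indicate that more of the variance is explained.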

4.1 Model Performance Comparison

The comparison of model performance is a critical aspect of evaluating the
effectiveness of different predictive algorithms in estimating calories burnt. In
this study, several machine learning models, including Linear Regression,
Random Forest, and XGBRegressor, were employed and compared based on
their predictive accuracy, computational efficiency, and robustness. Each
model's predictions on the test set were compared using the evaluation metrics;
lower MAE and RMSE values generally indicate better performance.

Linear Regression is a simple model that assumes a linear relationship between
the input features and calories burned. While simple and interpretable, it
exhibited higher prediction errors, with an RMSE of A (insert value) and an R²
value of B (insert value). This outcome is not entirely surprising, as the
linearity assumption may not adequately capture the complexities of caloric
expenditure influenced by various physiological and behavioral factors. While
each model has its strengths and limitations, the results underscore the
importance of selecting appropriate algorithms based on the specific
characteristics of the data and the objectives of the analysis.

The tree-based models were also used to understand which factors most
influence calorie burn prediction. Standardizing or normalizing the inputs can
improve model performance, and the quality of the data inputs (e.g., heart rate,
activity type, and duration) greatly affects prediction reliability.
High-accuracy models often employ advanced machine learning algorithms, such as
neural networks or ensemble methods, trained on extensive datasets that capture
diverse demographic and physiological characteristics.
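
The standardization step mentioned above can be sketched as follows. This is a hedged illustration in pure Python: the helper names and the heart-rate values are hypothetical, and the only point being shown is that the mean and standard deviation are estimated on the training data and then reused unchanged for the test data.

```python
import math


def fit_standardizer(column):
    """Return (mean, std) of a training column, using the population formula."""
    mean = sum(column) / len(column)
    variance = sum((x - mean) ** 2 for x in column) / len(column)
    return mean, math.sqrt(variance)


def standardize(column, mean, std):
    """Apply z-score scaling; a constant column (std == 0) maps to zeros."""
    if std == 0:
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]


# Hypothetical heart-rate readings (beats per minute).
train_heart_rate = [80.0, 100.0, 120.0]
test_heart_rate = [110.0]

# Statistics come from the training split only, to avoid leaking test data.
hr_mean, hr_std = fit_standardizer(train_heart_rate)
train_scaled = standardize(train_heart_rate, hr_mean, hr_std)
test_scaled = standardize(test_heart_rate, hr_mean, hr_std)
print(train_scaled, test_scaled)
```

Tree-based models such as Random Forest and XGBoost split on feature thresholds and are largely insensitive to this kind of rescaling, so the step matters mainly for scale-sensitive models.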

Table 4.1 Model Comparison

4.2 Comparative Analysis

In our comparative analysis of model performances for predicting calories
burnt, several key findings emerged that highlight the strengths and limitations
of different machine learning algorithms.

1. Performance Metrics: The XGBRegressor consistently outperformed other
models, achieving the lowest Mean Absolute Error (MAE) and Root Mean
Square Error (RMSE). This indicates its superior ability to accurately estimate
caloric expenditure compared to Linear Regression and Random Forest.

2. Handling Non-Linearity: While Linear Regression provided a quick and
straightforward approach, it struggled to capture the non-linear relationships
inherent in the dataset. In contrast, both Random Forest and XGBRegressor
effectively modeled these complexities, leading to more reliable predictions.

3. Feature Importance: The analysis revealed that specific features, such as
heart rate and weight, were significant predictors of calories burnt.
XGBRegressor’s ability to provide insights into feature importance helped
refine the understanding of which variables had the most substantial impact on
caloric expenditure.

4. Overfitting Prevention: XGBRegressor exhibited enhanced robustness due to
its built-in regularization techniques, which helped prevent overfitting. This was
particularly evident in scenarios with diverse data points, where the model
maintained accuracy across various demographic segments.

5. Computational Efficiency: Although Linear Regression was the fastest in
terms of training and prediction times, the superior accuracy of XGBRegressor
justified its longer computational requirements, especially in applications where
precision is critical.

6. Overfitting and Model Robustness: XGBRegressor excelled in avoiding
overfitting, largely due to its regularization techniques, which make it more
suitable for real-world applications with diverse data. Random Forest also
demonstrated good robustness, while Linear Regression, being limited to linear
relationships, tended to underfit when applied to more varied or larger
datasets.

7. Generalization Across Populations: Random Forest demonstrated good
generalization abilities across different subsets of data, making it a reliable
model when applied to diverse user profiles. However, XGBRegressor's
combination of accuracy and generalization ability still positioned it as the
superior model for this specific task.

8. Model Accuracy: Among the models tested, XGBRegressor showed the
highest accuracy in predicting calories burnt, achieving the lowest Mean
Absolute Error (MAE) and Root Mean Square Error (RMSE). This indicates its
ability to capture the complex relationships between activity parameters and
caloric expenditure more effectively than the other models.
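
Point 2 in the list above (handling non-linearity) can be illustrated with a small sketch. It fits an ordinary least-squares line to a synthetic quadratic relationship; the data and the variable names are invented for illustration and do not come from the project's dataset. The residuals show the systematic pattern that a straight line leaves behind when the true relationship is curved.

```python
def fit_line(xs, ys):
    """Closed-form simple linear regression: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, preds):
    """Share of the variance in ys explained by preds."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Synthetic example: the target grows with the square of the input.
intensity = [0.0, 1.0, 2.0, 3.0, 4.0]
calories = [x ** 2 for x in intensity]

slope, intercept = fit_line(intensity, calories)
predictions = [slope * x + intercept for x in intensity]
residuals = [y - p for y, p in zip(calories, predictions)]
r2 = r_squared(calories, predictions)

# Residuals are positive at both ends and negative in the middle,
# which is the signature of a curved relationship fitted by a line.
print(slope, intercept, residuals, round(r2, 3))
```

Tree-based models approximate such curves with piecewise-constant splits, which is one reason they can outperform a single linear fit on data of this kind.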

In this project, a comparative analysis of various machine learning models
(Linear Regression, Random Forest, and XGBRegressor) was conducted to
determine their effectiveness in predicting calories burnt.

Key Insights:

● The comparatively simple Random Forest model outperformed more
complex algorithms, highlighting the effectiveness of straightforward
approaches in certain scenarios.
● The performance gap between the best (0.90) and worst (0.79) models
was only 0.11 (11 percentage points), indicating a relatively narrow range
of accuracy across the evaluated algorithms.
● Traditional models, particularly Random Forest, demonstrated superior
performance compared to the more complex methods in our analysis,
suggesting that the nature of the dataset may have influenced the
effectiveness of these more complex algorithms.
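
The gap quoted in the second insight can be expressed in two ways, and the short calculation below shows both. The two scores are the ones given in the text; which metric they represent is not stated there, so the sketch treats them as generic scores on a 0 to 1 scale.

```python
best_score = 0.90   # best model score quoted in the text
worst_score = 0.79  # worst model score quoted in the text

# Absolute difference, expressed in percentage points.
absolute_gap = best_score - worst_score
percentage_points = absolute_gap * 100

# Relative difference, expressed as a percentage of the best score.
relative_gap_percent = absolute_gap / best_score * 100

print(f"{percentage_points:.1f} percentage points")
print(f"{relative_gap_percent:.1f}% relative to the best score")
```

Stating which of the two is meant avoids ambiguity when small differences between models are compared.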

4.3 Deployment Code Templates

import gradio as gr
import numpy as np
import pandas as pd
from xgboost import XGBRegressor


# Load your model and data (make sure to adjust the file paths)
class CaloriePredictor:
    def __init__(self):
        self.workout_csv_path = "workout.csv"  # Adjust this path accordingly

        # Load workout data and drop the first column
        workout_df = pd.read_csv(self.workout_csv_path)
        workout_df.drop(columns=workout_df.columns[0], inplace=True)

        # Prepare data for model training.
        # The remaining feature columns are assumed to be numeric and in the
        # same order as the values passed to predict_calories.
        X = workout_df.drop(columns=["Calories", "User_ID"])
        Y = workout_df[["Calories"]]

        # Train model
        self.XGBR_model = XGBRegressor().fit(X, Y)

    def predict_calories(self, gender, age, height, weight, duration,
                         heart_rate, body_temp):
        gender_numeric = 0 if gender == "M" else 1  # Convert gender to numeric

        # Prepare input for prediction
        input_data = np.array([[gender_numeric, age, height, weight, duration,
                                heart_rate, body_temp]])

        # Predict calories burned
        prediction = self.XGBR_model.predict(input_data)
        return int(prediction[0])

    def recommend_diet(self, goal):
        if goal == "Losing Weight":
            return {
                "Breakfast": "Oats with skim milk and fruits",
                "Lunch": "Grilled chicken salad with mixed vegetables",
                "Dinner": "Dal (lentils) with brown rice and steamed broccoli",
            }
        elif goal == "Gaining Weight":
            return {
                "Breakfast": "Paneer paratha with yogurt",
                "Lunch": "Chicken biryani with raita",
                "Dinner": "Paneer butter masala with naan",
            }
        else:
            return {
                "Breakfast": "Please select a valid goal.",
                "Lunch": "",
                "Dinner": "",
            }


# MET value per activity, keyed by lowercase activity name.
# This table is not shown in the report; fill it in before using
# calculate_calories_burned.
met_values = {}


# Function to calculate calories burned from a MET value
def calculate_calories_burned(weight_kg, duration_minutes, activity):
    # Convert duration to hours
    duration_hours = duration_minutes / 60

    # Get the MET value for the selected activity
    met = met_values.get(activity.lower())
    if met is None:
        return "Activity not found. Please enter a valid activity."

    # Calculate calories burned
    calories_burned = met * weight_kg * duration_hours
    return calories_burned


# Example use of the MET-based estimate (the caller supplies the values):
# calories_burned = calculate_calories_burned(weight_kg, duration_minutes, activity)
# print(f"Calories burned: {calories_burned:.2f}")

# Create an instance of the predictor
calorie_predictor = CaloriePredictor()


# Define Gradio interface
def gradio_interface(gender, age, height, weight, duration, heart_rate,
                     body_temp, goal):
    # Calculate BMI
    height_m = height / 100  # Convert cm to meters
    bmi = weight / (height_m ** 2)

    # Predict calories burned
    calories_burned = calorie_predictor.predict_calories(
        gender, age, height, weight, duration, heart_rate, body_temp)
    diet_recommendation = calorie_predictor.recommend_diet(goal)

    return (f"Your BMI is: {bmi:.2f}\n"
            f"The predicted calories burned are: {calories_burned}\n\n"
            f"**Diet Recommendations:**\n"
            f"- **Breakfast:** {diet_recommendation['Breakfast']}\n"
            f"- **Lunch:** {diet_recommendation['Lunch']}\n"
            f"- **Dinner:** {diet_recommendation['Dinner']}")


# Create inputs and outputs for the Gradio app
inputs = [
    gr.Radio(["M", "F"], label="Gender"),
    gr.Slider(minimum=0, maximum=100, label="Age"),
    gr.Slider(minimum=100, maximum=250, label="Height (cm)"),
    gr.Slider(minimum=30, maximum=200, label="Weight (kg)"),
    gr.Slider(minimum=1, maximum=180, label="Duration (minutes)"),
    gr.Slider(minimum=40, maximum=200, label="Average Heart Rate"),
    gr.Slider(minimum=30.0, maximum=42.0, label="Body Temperature (°C)"),
    gr.Radio(["Losing Weight", "Gaining Weight"], label="Goal"),
]

output = gr.Textbox(label="BMI and Calories Burned with Diet Recommendation")

# Launch the Gradio app
gr.Interface(
    fn=gradio_interface,
    inputs=inputs,
    outputs=output,
    title="Calorie Tracker with BMI and Diet Recommendations",
    description=("Enter your details to calculate your BMI and predict "
                 "calories burned during a workout. Receive diet "
                 "recommendations for breakfast, lunch, and dinner based on "
                 "your goals."),
).launch()
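
The MET-based helper in the listing above relies on a `met_values` lookup that the listing does not define. The sketch below shows the same formula (calories = MET × weight in kg × duration in hours) with a small stand-in table. The table name, the function name, and the MET numbers are hypothetical placeholders chosen for illustration, not values from the project; the sketch also raises an error for an unknown activity instead of returning a message string, which is a design choice of this example rather than the report's behavior.

```python
# Hypothetical MET table: illustrative placeholder values only.
ILLUSTRATIVE_MET_VALUES = {
    "walking": 3.5,
    "cycling": 7.5,
    "running": 9.8,
}


def estimate_calories_met(weight_kg, duration_minutes, activity,
                          met_table=ILLUSTRATIVE_MET_VALUES):
    """Estimate calories as MET * weight (kg) * duration (hours)."""
    met = met_table.get(activity.strip().lower())
    if met is None:
        # Raising keeps the return type numeric for every successful call.
        raise ValueError(f"Unknown activity: {activity!r}")
    duration_hours = duration_minutes / 60
    return met * weight_kg * duration_hours


# Example: a 70 kg person running for 30 minutes (inputs are made up).
estimate = estimate_calories_met(70, 30, "Running")
print(f"Estimated calories burned: {estimate:.1f}")
```

Keeping the return value numeric in all successful cases means a caller can format it with `:.2f` without first checking whether a string came back.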

CHAPTER 5

CONCLUSION AND FUTURE ENHANCEMENT


In conclusion, the prediction of calories burnt through machine learning
models has demonstrated significant potential in both fitness and health
applications. This project aimed to develop a robust and accurate model for
predicting calories burnt during various physical activities using machine
learning techniques. The comparative analysis of Linear Regression, Random
Forest, and XGBRegressor highlighted the importance of selecting the right
model based on accuracy, robustness, and ability to handle nonlinear
relationships. Through this analysis, we were able to determine that the
XGBRegressor provided the most accurate predictions, effectively capturing the
complexities of the relationship between physical activity parameters and
caloric expenditure.

The findings highlight the importance of selecting appropriate features, such
as weight, duration of exercise, and MET values, which significantly influence
the accuracy of the predictions. The project's results underscore the potential
for machine learning to enhance personalized health and fitness strategies,
enabling users to make informed decisions regarding their caloric intake and
physical activity levels.

Future Enhancement
The incorporation of real-time data from wearable devices could improve the
accuracy of predictions. By leveraging heart rate, movement patterns, and
metabolic data collected during activities, models could be fine-tuned for
individual users.

Future research could also focus on developing hybrid models that combine the
strengths of various machine learning techniques. By integrating ensemble
methods or neural networks with traditional algorithms, predictions could be
enhanced further, providing more accurate and reliable results.

Expanding the feature set to include variables such as age, gender, and
activity type could provide a more holistic view of caloric expenditure. These
demographic factors can significantly influence metabolism and should be
considered in any predictive model.

Future enhancements could also focus on creating user-friendly applications
that allow individuals to input their data and receive real-time caloric burn
estimates. This could include mobile applications that utilize machine learning
models to provide immediate feedback during workouts.

DEPLOYMENT

Fig 5.1 Deployment 1

Fig 5.2 Deployment

Fig 5.3 Deployment

REFERENCES

[1] Goukens, Caroline, and Anne Kathrin Klesse. "Internal and external
forces that prevent (vs. Facilitate) healthy eating: Review and outlook within
consumer Psychology." Current Opinion in Psychology (2022): 101328.

[2] Khan, Abdul Wahid, et al. "Factors Affecting Fitness Motivation: An
Exploratory Mixed Method Study." IUP Journal of Marketing Management
21.2 (2022). https://www.medicalnewstoday.com/articles/319731

[3] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5496172/

[4] Roberts, K. C., Shields, M., de Groh, M., Aziz, A., & Gilbert, J. A.
(2012). Overweight and obesity in children and adolescents: results from the
2009 to 2011 Canadian Health Measures Survey. Health rep, 23(3), 37-41.

[5] Kalpesh, Jadhav, et al. "Human Physical Activities Based Calorie Burn
Calculator Using LSTM." Intelligent Cyber Physical Systems and Internet of
Things: ICoICI 2022. Cham: Springer International Publishing, 2023. 405-424.

[6] Tayade, Akshit Rajesh, and Hadi Safari Katesari. "A Statistical Analysis
to Develop Machine Learning Models: Prediction of User Diet Type."

[7] Gour, Sanjay, et al. "A Machine Learning Approach for Heart Attack
Prediction." Intelligent Sustainable Systems: Selected Papers of WorldS4
2021, Volume 1. Springer Singapore, 2022.
