

Enhancing Sales of E-Learning Platforms Using


Machine Learning

A Project Report To Be Submitted To Indian Institute of Technology (BHU), Varanasi In


Partial Fulfilment Of The Requirements For The Award Of The Degree Of

BACHELOR OF TECHNOLOGY
IN
MECHANICAL ENGINEERING

Submitted By

KAPIL KUMAR SHARMA 17135048


PIYUSH AGRAWAL 17135061
PRIYAM GARG 17135065
SHUBHAM RAJ 17135083
VAIBHAV 17134024

Under The Esteemed Guidance Of


Prof. Anil Kumar Agrawal
Department of Mechanical Engineering
Indian Institute of Technology (BHU), Varanasi

ACKNOWLEDGEMENT

It is our great fortune to have had the opportunity to work on this exciting and
thought-provoking project in this institute. The learning and experience we have
received here are of inestimable value to us. Gratitude is one of the deepest
expressions of one's heart, so it gives us immense pleasure to express our
paramount gratitude to each one of those who made this possible.

We would like to express our deep sense of gratitude to Dr. Anil Kumar Agrawal,
Professor, IIT (BHU) for providing us this unique opportunity of carrying out the
project and for his constant guidance, encouragement and timely support throughout
the course of this project.

TABLE OF CONTENTS

1. ACKNOWLEDGEMENT
2. INTRODUCTION
3. PREVIOUS WORK
4. MACHINE LEARNING MODELS
5. PERFORMANCE METRICS
6. FEATURE EXTRACTION
7. NEURAL NETWORKS
8. COURSE RECOMMENDER SYSTEM
9. CONCLUSION

INTRODUCTION
In today's world, time is a scarce commodity in sales, and one of the most critical
business decisions of a company is related to customer acquisition. During the
acquisition phase of the customer life cycle, companies try to convert leads into
customers through different methods. One useful way to save time while increasing
sales is to make sure that we focus on the best available leads and do not waste time
on inactive leads. In order to increase the conversion rate, we used customer-specific
data such as time spent on the website, total visits and occupation; the customers were
then ranked in order of their probability of conversion and pursued by salespeople.
To achieve this goal, real-world data is used as input to various machine learning
models, which predict the probability of conversion using the important features of
the data.

A recommendation engine filters the data using different algorithms and
recommends the most relevant items to users. It first captures the past behavior of a
customer and, based on that, recommends products which the user is likely to buy.
If a completely new user visits an e-commerce site, that site will not have any past
history of that user. So how does the site go about recommending products to the
user in such a scenario? One possible solution could be to recommend the best-selling
products, i.e. the products which are in high demand. Another possible solution could
be to recommend the products which would bring the maximum profit to the business.

PREVIOUS WORK

The aim of our project is to help e-learning platforms increase their sales and
customer conversion rate using machine learning. Our previous work involved data
preprocessing, in which we performed data cleaning, feature scaling and one-hot
encoding so that the data could be fed into machine learning models for prediction.
We then performed data visualization to draw valuable insights and inferences from
the data, which can be interpreted and used to support better business decision
making and so optimize sales. Previously we used supervised machine learning
models, i.e. Logistic Regression, Random Forest, Support Vector Machine,
K-Nearest Neighbour and Naive Bayes, to predict the most promising customers.

Data Preprocessing

Data Cleaning - Data cleaning is the process of fixing or removing incorrect,
corrupted, incorrectly formatted, duplicate, or incomplete data within a dataset.
When combining multiple data sources, there are many opportunities for data to be
duplicated or mislabeled.

Feature Scaling - Feature scaling is a method used to normalize the range of
independent variables or features of data.

One Hot Encoding - One hot encoding is a process by which categorical variables
are converted into a form that could be provided to ML algorithms to do a better job
in prediction.
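
As an illustration of these three steps, a minimal sketch in Python using pandas and
scikit-learn is shown below; the column names (TotalVisits, TimeOnWebsite,
Specialization, Occupation, Converted) are placeholders, not the project's actual schema.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Illustrative leads data; the columns are hypothetical stand-ins.
df = pd.DataFrame({
    "TotalVisits": [5, 3, 8, None, 2],
    "TimeOnWebsite": [674, 1532, 305, 210, 1428],
    "Specialization": ["Finance", "HR", "Finance", "Marketing", "HR"],
    "Occupation": ["Student", "Working", "Student", "Working", "Student"],
    "Converted": [1, 0, 1, 0, 1],
})

# Data cleaning: drop duplicate rows and fill missing numeric values
df = df.drop_duplicates()
df["TotalVisits"] = df["TotalVisits"].fillna(df["TotalVisits"].median())

# Feature scaling: normalise the numeric columns
numeric_cols = ["TotalVisits", "TimeOnWebsite"]
df[numeric_cols] = StandardScaler().fit_transform(df[numeric_cols])

# One-hot encoding: expand the categorical columns into indicator variables
df = pd.get_dummies(df, columns=["Specialization", "Occupation"], drop_first=True)
print(df.head())
```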

Data Visualization
Some of the data visualizations of important features from our previous work:

[Figure: Specialization]

[Figure: Occupation]

Machine Learning Models


In our previous work we used five different supervised learning models:
1. Logistic Regression
2. K Nearest Neighbour
3. Support Vector Machine
4. Naive Bayes
5. Random Forest Classification

Logistic Regression
Logistic regression is a predictive analysis regression technique where the dependent
variable is categorical. Logistic regression is used to describe data and to explain
the relationship between one dependent binary variable and one or more independent
variables. It uses a logistic function called the sigmoid function as its activation
function, which outputs a value between 0 and 1.
Best hyperparameter-tuned logistic regression model:

Optimizer | Regularization | Number of iterations | Accuracy (%)
lbfgs*    | L2             | 200                  | 91.36

*lbfgs - Limited-memory Broyden–Fletcher–Goldfarb–Shanno algorithm
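
A minimal scikit-learn sketch of fitting a logistic regression with the tabulated
hyperparameters (lbfgs solver, L2 penalty, 200 iterations); the synthetic data below
is only a stand-in for the preprocessed leads dataset.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Stand-in data; in the project this would be the preprocessed leads dataset.
X, y = make_classification(n_samples=5000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Hyperparameters from the table above: lbfgs solver, L2 regularisation, 200 iterations.
model = LogisticRegression(solver="lbfgs", penalty="l2", max_iter=200)
model.fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))
```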

K-nearest neighbor
The principle behind nearest neighbour methods is to find a predefined number of
training samples closest in distance to the new point, and predict the label from these.
The number of samples can be a user-defined constant (k-nearest neighbour
learning), or vary based on the local density of points (radius-based neighbour
learning).

Best hyperparameter-tuned k-nearest neighbor model:

Metric    | Weights | Number of neighbors | Accuracy (%)
Minkowski | Uniform | 5                   | 91.06

Support Vector Machine


Support Vector Machine (SVM) is a machine learning algorithm which is used
for classification problems. In the SVM algorithm, we plot each data item as a point
in n-dimensional space (where n is number of features specified) with the value of
each feature being the value of a particular coordinate. Then, the algorithm performs
classification by finding the hyper-plane that differentiates the two classes very well.

Best hyperparameter-tuned support vector machine model:

Kernel | Gamma | C | Accuracy (%)
Linear | Scale | 1 | 92.37
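
The hyperparameter tuning behind tables such as this one can be reproduced with a
grid search; the sketch below is an illustrative assumption about the procedure, using
scikit-learn's GridSearchCV on synthetic stand-in data to search over kernel, gamma and C.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

param_grid = {
    "kernel": ["linear", "rbf"],
    "gamma": ["scale", "auto"],
    "C": [0.1, 1, 10],
}
search = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)
print("Best parameters:", search.best_params_)
print("Test accuracy:", search.best_estimator_.score(X_test, y_test))
```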

Naive Bayes
Naive Bayes methods are a set of supervised learning algorithms based on applying
Bayes’ theorem with the “naive” assumption of conditional independence between
every pair of features given the value of the class variable.

Best hyperparameter-tuned Naive Bayes model:

Naive Bayes algorithm | Accuracy (%)
Gaussian              | 87.54

Random Forest
Random forest is a supervised learning algorithm which is used for classification
problems. It is an ensemble tree-based learning algorithm. It is a set of decision trees
from a randomly selected subset of the training set. The algorithm creates decision
trees on data samples and then gets the prediction from each of them and finally
selects the best solution by means of voting.
Best hyperparameter-tuned random forest model:

Criterion of split | Max no. of features | Number of estimators | Accuracy (%)
Entropy            | Auto                | 150                  | 91.11

Performance Metrics Beyond Accuracy


Confusion Matrix

A confusion matrix is a table that is often used to describe the performance of a
classification model (or "classifier") on a set of test data for which the true values
are known.

Precision

The ratio of correct positive predictions to the total predicted positives.

Precision = (TP) / (TP+FP)

Recall

The ratio of correct positive predictions to the total positive examples.

Recall = (TP) / (TP+FN)
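
These definitions map directly onto scikit-learn's metrics; a small illustrative example
with made-up labels is shown below.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]   # illustrative true labels
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]   # illustrative predictions

# ravel() returns the counts in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("Precision:", tp / (tp + fp), precision_score(y_true, y_pred))
print("Recall:   ", tp / (tp + fn), recall_score(y_true, y_pred))
```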

Threshold Value

The output of a classification model is a probability, so we can select a threshold
value. If the probability is greater than this threshold, the event is predicted to happen;
otherwise it is predicted not to happen. A confusion (classification) matrix compares
the actual outcomes to the predicted outcomes.

Precision-Recall Curve

A precision-recall curve (or PR curve) is a plot of the precision and the recall for
different probability thresholds. In our project we selected the optimum threshold
value that gives the best precision and recall values. The optimum threshold value
comes out to be 0.2, as shown in the precision-recall curve.
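
A sketch of how such a threshold can be chosen from the precision-recall curve with
scikit-learn; the scores are illustrative only, and selecting by best F1 is just one
possible notion of the optimum trade-off.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# y_scores are predicted probabilities of the positive class (e.g. from predict_proba)
y_true   = np.array([0, 0, 1, 1, 0, 1, 0, 1, 1, 0])
y_scores = np.array([0.1, 0.3, 0.35, 0.6, 0.2, 0.8, 0.15, 0.4, 0.7, 0.05])

precision, recall, thresholds = precision_recall_curve(y_true, y_scores)

# Pick the threshold with the best F1 score (the last precision/recall pair has no threshold)
f1 = 2 * precision * recall / (precision + recall + 1e-12)
best = np.argmax(f1[:-1])
print("Best threshold:", thresholds[best])
```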

Results Obtained For Support Vector Machine Model

Accuracy: 92.37%

Precision: 90.06%

Recall: 86.32%

FEATURE EXTRACTION
Companies have more data than ever, so it is crucial to know the difference between
useful data and unuseful data. Among the important aspects of machine learning are
feature selection and feature extraction. Feature selection is the problem of selecting
some subset of a learning algorithm's input variables upon which it should focus
attention, while ignoring the rest; it can significantly improve a learning algorithm's
performance. Feature extraction involves reducing the number of resources required
to describe a large set of data. When performing analysis of complex data, one of the
major problems stems from the number of variables involved. Analysis with a large
number of variables generally requires a large amount of memory and computation
power, and it may also cause a classification algorithm to overfit the training samples
and generalize poorly to new samples.

Out of the 37 features present in the data, many features have "No" as the primary
answer (up to 90% of responses), so not much information could be extracted from
them even if the machine learning models were run on them. Not only would these
features reduce the overall quality of the data, but they would also degrade the
model's learning ability. These features were dropped during the processing of the data.

RECURSIVE FEATURE ELIMINATION


Recursive feature elimination (RFE) is a feature selection method that fits a model
and removes the weakest feature (or features) until the specified number of features
is reached. It recursively removes features, builds a model using the remaining
attributes and calculates model accuracy. RFE is able to work out the combination of
attributes that contribute to the prediction of the target variable. Features are ranked
by the model's feature importance attribute, and by recursively eliminating a small
number of features per loop, RFE attempts to eliminate dependencies and collinearity
that may exist in the model.
RFE requires a specified number of features to keep; however, it is often not known
in advance how many features are required. To find the optimal number of features,
cross-validation is used with RFE to score different feature subsets and select the
best-scoring collection of features. Using RFE for each type of model, all the features
relevant to that specific model were selected from the dataset and the rest of the
features were dropped. Finally, the dataset was split randomly into a training set and
a test set.
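
A minimal sketch of RFE with cross-validation using scikit-learn's RFECV; the
synthetic data and the logistic-regression estimator are assumptions for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_features=30, n_informative=8, random_state=42)

# RFECV: recursive feature elimination with cross-validation to pick the number of features
selector = RFECV(LogisticRegression(max_iter=500), step=1, cv=5, scoring="accuracy")
selector.fit(X, y)
print("Optimal number of features:", selector.n_features_)
print("Selected feature mask:", selector.support_)
```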
[Table: features selected by recursive feature elimination]

PRINCIPAL COMPONENT ANALYSIS


Principal Component Analysis (PCA) is an unsupervised, non-parametric statistical
technique primarily used for dimensionality reduction in machine learning. High
dimensionality means that the dataset has a large number of features. The primary
problem associated with high dimensionality in the machine learning field is model
overfitting, which reduces the ability to generalize beyond the examples in the
training set. Models also become more efficient, as the reduced feature set boosts
learning rates and diminishes computation costs by removing redundant features.
PCA can also be used to filter noisy datasets and for tasks such as image compression.
The first principal component expresses the largest amount of variance. Each
additional component expresses less variance and more noise, so representing the
data with a smaller subset of principal components preserves the signal and discards
the noise.

PCA Workflow-
1. Normalize the data - PCA is used to identify the components with the maximum
variance, and the contribution of each variable to a component is based on its
magnitude of variance. It is best practice to normalize the data before conducting a
PCA as unscaled data with different measurement units can distort the relative
comparison of variance across features.

2. Create a covariance matrix - A useful way to get all the possible relationships
between all the different dimensions is to calculate the covariance among them all
and put them in a covariance matrix which represents these relationships in the data.

3. Select the optimal number of principal components - The optimal number of
principal components is determined by looking at the cumulative explained variance
ratio as a function of the number of components.
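
A short sketch of this workflow with scikit-learn, keeping enough components to
explain, say, 95% of the variance; the 95% cutoff and the synthetic data are
illustrative choices, not taken from the report.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = make_classification(n_samples=1000, n_features=30, random_state=42)

# Step 1: normalise the data before PCA
X_scaled = StandardScaler().fit_transform(X)

# Steps 2-3: PCA works from the covariance structure; inspect the cumulative explained variance
pca = PCA().fit(X_scaled)
cumulative = np.cumsum(pca.explained_variance_ratio_)
n_components = int(np.argmax(cumulative >= 0.95)) + 1   # smallest count reaching 95% variance
print("Components needed for 95% variance:", n_components)
```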

NEURAL NETWORK

A neural network is a computational learning system that uses a network of functions


to understand and translate a data input of one form into a desired output, usually in
another form. Machine learning algorithms that use neural networks generally do not
need to be programmed with specific rules that define what to expect from the input.
The neural net learning algorithm instead learns from processing many labeled
examples that are supplied during training and using this answer key to learn what
characteristics of the input are needed to construct the correct output. Once a
sufficient number of examples have been processed, the neural network can begin
to process new, unseen inputs and successfully return accurate results.

The architecture of neural networks is as follows:-

1. The input layer, which takes the training values as inputs.
2. Hidden layers, which try to establish the hidden relation between the input and
output variables.
3. The output layer, which provides the final prediction of the model.

Hyperparameters
Hyperparameters are important because they directly control the behaviour of the
training algorithm and have a significant impact on the performance of the model
being trained.
Needs for hyperparameter tuning:
1. To find the right balance between bias and variance.
2. To prevent the model from running into the vanishing/exploding gradient
problem.
3. To avoid getting stuck in local optima.
4. To prevent non-convergence of the model.
Hyperparameter tuning done in our models:
1. Number of layers: This was chosen keeping in mind that a very high number may
introduce problems like overfitting and vanishing or exploding gradients, while a
lower number may give a high-bias, low-capacity model. The number of hidden
layers during hyperparameter tuning was in the range of 2 to 5.

2. Number of hidden units per layer: This was also selected to find the right spot
between high bias and high variance; it also depends on the size of the data used for
training. The hidden units in our models were in general powers of 2 between 128
and 1024 and were used in different combinations with different models.

3. Activation function: Our choices were ReLU, Sigmoid and Tanh.

4. Optimizer: This is the algorithm used by the model to update the weights of every
layer after every iteration. Popular choices are SGD, RMSProp and Adam. All three
have different properties and are used according to the data. We achieved great
results using RMSProp and Adam.

5. Learning rate: This is responsible for the core learning characteristic. We chose it
such that it is not so high that the model is unable to converge to a minimum, and
not so low that the learning process is unnecessarily slow. We tried powers of 10,
specifically 0.001, 0.01, 0.1 and 1.

6. Batch size: This indicates the number of patterns shown to the network before the
weight matrix is updated. If the batch size is small, each update sees fewer repeated
patterns, so the weights fluctuate a lot and convergence becomes difficult. If the
batch size is high, learning becomes slow, as the weights are updated only after many
examples have been seen. We tried batch sizes in the range of 32 to 512, and in
general the batches which were powers of 2 gave the best results.
7. Number of epochs: The number of epochs is the number of times the entire
training data is shown to the model. It played an important role in how well the
model fit the training data. A high number of epochs caused overfitting on our data,
while a lower number of epochs limited the potential of the model, leading to
underfitting. A large number of different epoch counts were used while training the
models, and many led to overfitting or underfitting; epochs in the range of 18 to 23
gave the best results.

8. Dropout: The keep-probability of the dropout layer can be thought of as a
hyperparameter which acts as a regularizer and helps us find the optimum
bias-variance spot. When dropout is used, the model drops certain connections every
iteration, so the hidden units cannot depend too much on any particular feature. The
value can be anywhere between 0 and 1 and is chosen based on how much the model
is overfitting. When we used a 5-layer deep neural network there was a serious
overfitting problem, which we tried to overcome using high dropout. In general,
while using 2-layer deep neural networks, we experimented with keep-probability
values between 0.2 and 0.4.

9. L1/L2 regularization: This serves as another regularizer, in which very high
weight values are curbed so that the model is not dependent on a single feature. This
generally reduces variance with a trade-off of increasing bias, i.e. lowering accuracy.
We experimented with both types of regularization and, after feature selection, we
generally used L2 regularization.

Visual Representation of Models

1. Overfitted model
Number of hidden layers: 4
No. of nodes in each layer: 512, 512, 300, 256
Activation: ReLU; Optimizer: Adam; Learning rate: 0.003

2. Underfitted model
Number of hidden layers: 1
No. of nodes in each layer: 512
Activation: Tanh; Optimizer: RMSProp; Learning rate: 0.2

3. Best-fit model
Number of hidden layers: 2
No. of nodes in each layer: 512, 512
Activation: ReLU; Optimizer: Adam; Learning rate: 0.003
Dropout of 0.3 in both hidden layers.
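
The report does not state which deep-learning framework was used; as an illustration,
a minimal Keras/TensorFlow sketch of the best-fit configuration above (two hidden
layers of 512 ReLU units, dropout 0.3, Adam with learning rate 0.003) might look like this:

```python
import tensorflow as tf

def build_model(n_features):
    """Two hidden layers of 512 ReLU units with dropout 0.3, as in the best-fit model above."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        tf.keras.layers.Dense(512, activation="relu"),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(512, activation="relu"),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # probability of conversion
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.003),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model

# Typical training call; 20 epochs and a batch size of 64 fall inside the ranges reported above.
# model = build_model(n_features=30)
# model.fit(X_train, y_train, epochs=20, batch_size=64, validation_split=0.1)
```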

Course Recommendation System

Four types of recommendation systems have been implemented which are described
as follows-
1. Simple Recommender -
The Simple Recommender offers generalized recommendations to every user based
on course popularity. The basic idea behind this recommender is that courses that
are more popular and more critically acclaimed will have a higher probability of
being liked by the average user. The main drawback of this recommender system is
that it does not give recommendations personalized to the user.
In the simple recommendation system, weighted ratings were calculated and
recommendations were made on the basis of these ratings.

Weighted Rating (WR) = (v/(v+m))*R + (m/(v+m))*C

where,

● v is the number of votes for the course
● m is the minimum number of votes required for the course to be listed in the chart
● R is the average rating of the course
● C is the mean rating across all courses
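
A small pandas sketch of this computation on made-up course data; the 60th-percentile
cutoff used for m is an illustrative choice, not the report's value.

```python
import pandas as pd

courses = pd.DataFrame({
    "course":         ["A", "B", "C"],
    "vote_count":     [2500, 800, 120],   # v
    "average_rating": [8.6, 7.9, 9.1],    # R
})

C = courses["average_rating"].mean()        # mean rating across all courses
m = courses["vote_count"].quantile(0.60)    # minimum votes required (a chosen cutoff)

def weighted_rating(row, m=m, C=C):
    v, R = row["vote_count"], row["average_rating"]
    return (v / (v + m)) * R + (m / (v + m)) * C

courses["weighted_rating"] = courses.apply(weighted_rating, axis=1)
print(courses.sort_values("weighted_rating", ascending=False))
```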

2. Content Based Recommender -


To personalise our recommendations further, an engine that computes similarity
between courses based on certain metrics has been built, and courses that are most
similar to a particular course that a user liked are suggested.
Two content-based recommender systems were implemented, based on different
metrics:
1. A TF-IDF vectorizer was applied on the descriptions of the courses, and similarity
scores were obtained by applying cosine similarity on the TF-IDF matrix.
2. A TF-IDF vectorizer was applied on the course organization and course instructor,
and similarity scores were calculated using cosine similarity.

The user's taste and history are not considered in this recommender system.

3. Collaborative filtering -

Collaborative Filtering is based on the idea that users similar to us can be used to
predict how much we will like a particular product or service those users have
used/experienced but we have not. Singular value decomposition (SVD) was used
to predict how much rating a user will give to a particular course.
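
The report does not name a specific library; one common way to implement this step
is with the Surprise library's SVD, sketched below on made-up ratings data.

```python
import pandas as pd
from surprise import SVD, Dataset, Reader

# Hypothetical ratings data: (user_id, course_id, rating)
ratings = pd.DataFrame({
    "user_id":   [1, 1, 2, 2, 3],
    "course_id": ["c1", "c2", "c1", "c3", "c2"],
    "rating":    [9, 7, 8, 6, 10],
})

reader = Reader(rating_scale=(1, 10))
data = Dataset.load_from_df(ratings[["user_id", "course_id", "rating"]], reader)
trainset = data.build_full_trainset()

algo = SVD()
algo.fit(trainset)
print(algo.predict(uid=1, iid="c3").est)   # predicted rating of course "c3" for user 1
```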

4. Hybrid Recommender -

In this system, all three recommender systems were combined to obtain optimised
course recommendations.
First, recommendations were retrieved from the content-based recommender; the
retrieved courses were then sorted on the basis of the weighted ratings obtained from
the simple recommender system, and finally the ratings a particular user would give
to these courses were predicted using collaborative filtering. In this way, the system
considers the user's taste, the user's history and the popularity of courses.
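
A compact sketch of this pipeline, assuming the objects from the previous steps
already exist: a cosine-similarity matrix, a course DataFrame with weighted_rating
and course_id columns, a title-to-index mapping, and a trained SVD model. All of
these names are placeholders, and the candidate-pool sizes (25 and 15) are illustrative.

```python
def hybrid_recommend(user_id, course_title, courses_df, cosine_sim, indices, svd_model, top_n=5):
    """Hybrid sketch: content similarity -> weighted rating -> personalised SVD rating."""
    # 1. Content-based retrieval: the 25 courses most similar to the input course
    idx = indices[course_title]
    sim_scores = sorted(enumerate(cosine_sim[idx]), key=lambda x: x[1], reverse=True)[1:26]
    candidates = courses_df.iloc[[i for i, _ in sim_scores]].copy()

    # 2. Popularity filter: keep the candidates with the highest weighted ratings
    candidates = candidates.sort_values("weighted_rating", ascending=False).head(15)

    # 3. Personalisation: predict the rating this user would give to each candidate
    candidates["est_rating"] = candidates["course_id"].apply(
        lambda cid: svd_model.predict(user_id, cid).est
    )
    return candidates.sort_values("est_rating", ascending=False).head(top_n)
```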

Algorithms

TF-IDF vectorizer -

TF-IDF is an abbreviation for Term Frequency-Inverse Document Frequency. It is a
very common algorithm for transforming text into a meaningful representation of
numbers, which is then used to fit machine learning algorithms for prediction.

Term Frequency -
The number of times a word appears in a document divided by the total number of
words in the document. Every document has its own term frequency.

Inverse Document Frequency (IDF) -

The log of the number of documents divided by the number of documents that
contain the word w. Inverse document frequency determines the weight of rare words
across all documents in the corpus.

TF-IDF is simply the TF multiplied by IDF.
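
For illustration (with made-up numbers, using base-10 logarithms): if a word appears
3 times in a 100-word course description, its TF is 3/100 = 0.03; if it occurs in 10 out
of 1,000 descriptions, its IDF is log(1000/10) = 2, so its TF-IDF weight is
0.03 × 2 = 0.06.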



Cosine similarity -
Cosine similarity is a measurement that quantifies the similarity between two or
more vectors. The cosine similarity is the cosine of the angle between the vectors.
Mathematically, it is the dot product of the vectors divided by the product of their
Euclidean norms (magnitudes).
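
A minimal scikit-learn sketch of this step on a few made-up course descriptions:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

descriptions = [
    "Introductory statistics with Python for data analysis",
    "Machine learning and data science in Python",
    "Corporate finance essentials for managers",
]

tfidf = TfidfVectorizer(stop_words="english")
tfidf_matrix = tfidf.fit_transform(descriptions)

# cosine_sim[i, j] is the similarity between course i and course j
cosine_sim = cosine_similarity(tfidf_matrix, tfidf_matrix)
print(cosine_sim.round(2))
```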

Results

Results from simple recommender system -

In this system, an input keyword is given, and courses are recommended on the basis
of their weighted ratings.
Let's say the input is "Statistics"; then our system recommends the following courses:

Course | Course Organisation | Weighted Rating
Business Statistics and Analysis | Rice University | 8.21
Introduction to Statistics & Data Analysis in Public Health | Imperial College London | 8.1
Methods and Statistics in Social Sciences | University of Amsterdam | 7.98
Statistics with Python | University of Michigan | 7.95
Statistics with R | Duke University | 7.93

Results from content based recommender system -

In this system, an input course is given, and courses similar to the input are
recommended on the basis of the similarity metrics.
Let's say the input course is "Introduction to Data Science in Python" by the
University of Michigan; then our system recommends the courses shown in the table below:

Course | Organization
Advanced Data Science with IBM | IBM
Data Science | Johns Hopkins University
Mathematics for Data Science | National Research University Higher School of Economics
Applied Machine Learning in Python | University of Michigan
Databases and SQL for Data Science | IBM

Results from collaborative filtering -

In collaborative filtering, course ratings are predicted using the SVD algorithm.
For a particular user, the ratings of all the courses which the user has not yet rated
were predicted and sorted in descending order of the predicted rating in order to
recommend courses. This model tries to predict ratings based on how other users
have rated the course. Two inputs are taken by this system from the dataset:
1. Course_id
2. User_id

In the code snippet shown below, the function takes the predictions from SVD and
the number of courses to recommend as inputs and returns the top recommendations
for each user.
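
The original snippet is not reproduced here; a minimal sketch of such a function,
assuming the predictions come from the Surprise library's algo.test() (each prediction
carries the user id, course id, true rating and estimated rating), could look like this:

```python
from collections import defaultdict

def get_top_n(predictions, n=5):
    """Return the n courses with the highest predicted rating for each user."""
    top_n = defaultdict(list)
    for uid, iid, true_r, est, _ in predictions:
        top_n[uid].append((iid, est))          # collect (course, estimated rating) per user
    for uid, ratings in top_n.items():
        ratings.sort(key=lambda x: x[1], reverse=True)
        top_n[uid] = ratings[:n]               # keep only the n best-rated courses
    return top_n
```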

Results from hybrid recommender -


This recommender system takes two inputs: User_id and course.
Let's say the two inputs are (1, "Supply Chain Finance and Blockchain Technology");
then the recommendations for the user with user_id 1 are shown below:

Course | Estimated Rating
Foundational Finance for Strategic Decision Making | 8.4
Finance for Non-Finance Professionals | 8.10
Corporate Finance Essentials | 8.05
Behavioral Finance | 8.03
Understanding Modern Finance | 7.98

CONCLUSION

In our overall BTP work, a detailed and comprehensive analysis of data from an
e-learning platform was performed. Analysis of the data showed how many
inferences can be drawn from the raw data. Raw data cleaning was done, which
included handling missing values, encoding categorical variables and feature scaling
of the variables.

For feature extraction, PCA and RFE (Recursive Feature Elimination) were used,
which greatly affected the models' performance. The extracted features were then
used as input variables for five classification models, each with a wide range of
hyperparameters. Tuning those parameters revealed how strongly a model is affected
by its hyperparameters. The SVM model using the linear kernel gave the highest
accuracy of 92.37%, but the other models, Logistic Regression (accuracy of 91.36%),
K Nearest Neighbour (accuracy of 91.06%) and Random Forest (accuracy of
91.11%), all showed accuracies very close to SVM. The results showed that even
though there are a great number of different classification algorithms which are all
good at making predictions, one of the major factors influencing accuracy is how the
data preprocessing is done. Feature extraction and feature selection play a vital role
in model performance.

Classification of customers was also done using neural networks, which involve a
large number of hyperparameters. Choosing the correct set of hyperparameters, such
as the number of hidden layers, the number of nodes in each layer, the learning rate,
the type of optimizer and the activation functions, played a vital role in the
performance of the model. Deep neural networks tend to overfit the training data,
which we overcame by using dropout and different regularization techniques.

Using the data of the different courses on the website along with user-specific data,
we also developed a recommendation system which can suggest the courses a user is
most likely to be interested in. This recommended-course data can be helpful to the
sales team and can further increase the conversion rate, as these courses are tailored
towards users who already have a high probability of buying a course. Hence, with
this targeted, user-specific system, the number of courses bought per user can be
increased along with the conversion rate.
