Netflix Movie Recommendation System
Project I Report
On
“NETFLIX MOVIE RECOMMENDATION SYSTEM”
We hereby certify that the work which is being presented in the Project I Report entitled,
“NETFLIX MOVIE RECOMMENDATION SYSTEM” by us, Ashutosh Panwar (1220188),
in partial fulfillment of the requirements for the award of the degree of Bachelor of Technology
in Computer Science Engineering submitted in the Department of Computer Science and
Engineering at JMIT Radaur (Affiliated to Kurukshetra University, Kurukshetra,
Haryana (India)) is an authentic record of our own work carried out under the supervision of
Dr. Monika. The matter presented in the report has not been submitted to any other
University/Institute for the award of any degree.
Ashutosh Panwar
This is to certify that the above statement made by the candidate is correct to the best of
my/our knowledge.
Dr. Monika
A.P, Department of CSE. JMIT Radaur
The writing of this project report has been assisted by the generous help of many people. We feel
that we were very fortunate to receive assistance from them. We wish to express our sincere
appreciation to them.
First and foremost, we are indebted to our principal supervisor, Dr. Monika (A.P, Department of
Computer Science and Engineering) of JMIT Radaur, who has been very supportive at every
stage of our project completion. We wish to express our utmost gratitude to her for her
invaluable advice and patience in reading, correcting, and commenting on the drafts of this report
and, more importantly, for the generosity which we have received from her throughout our project
completion.
We wish to express our thanks to all staff members of JMIT Radaur, who also helped us in
conducting this study.
Finally, we are particularly indebted to our dearest parents/guardians, as without their generous
assistance and love this project could never have been completed.
Ashutosh Panwar
(1220188)
Abstract
In this hustling world, entertainment is a necessity for each of us to refresh our mood and
energy. Entertainment restores our confidence for work and lets us work more enthusiastically.
To revitalize ourselves, we can listen to our preferred music or watch movies of our
choice. To find suitable movies online, we can use movie recommendation systems,
which are more reliable than manual searching, since searching for preferred movies takes time
that one cannot afford to waste. In this paper, to improve the quality of a movie
recommendation system, a hybrid approach is presented that combines content-based filtering and
collaborative filtering, using a Support Vector Machine as a classifier and a genetic algorithm.
Comparative results on three different datasets show that the proposed approach improves the
accuracy, quality, and scalability of the movie recommendation system compared with the pure
approaches. The hybrid recommendation system combines both content-based and collaborative
filtering algorithms to predict the user's interest in movies.
Table of Contents
Declaration
Acknowledgement
Abstract
Table of Contents
List of Figures
Chapter 1. INTRODUCTION
I. Relevance of the Project
II. Problem Statement
III. Objective
IV. Scope of the Project
V. Methodology
I. Hardware Requirements
II. Software Specification
III. Software Requirements
I. System Architecture
II. Activity Diagram
III. Flowchart
Chapter 5. IMPLEMENTATION
I. Cosine similarity
II. Cosine similarity
III. Experimental Setup
IV. Front-End/Back-End implementation details
Chapter 7. Testing
Chapter 9. Conclusion
List of figures
Chapter 1
Introduction
A recommendation system, or recommendation engine, is an information-filtering model that tries
to predict the preferences of a user and provides suggestions based on these preferences. These
systems have become increasingly popular and are widely used in areas such as movies, music,
books, videos, clothing, restaurants, food, places and other utilities. They collect information
about a user's preferences and behaviour, and then use this information to improve their
suggestions in the future.
Movies are part and parcel of life. There are different types of movies: some are for
entertainment, some are for educational purposes, some are animated movies for children, and
some are horror or action films. Movies can be easily differentiated by genre, such as
comedy, thriller, animation or action. Other ways to distinguish among movies are by
release year, language, director and so on. When watching movies online, there are a large
number of movies to search through to find the ones we like most. Movie recommendation systems
help us find our preferred movies among all of these different types and hence reduce the
trouble of spending a lot of time searching for them. The movie recommendation system should
therefore be very reliable and should recommend movies that exactly or most closely match our
preferences.
A large number of companies make use of recommendation systems to increase user
interaction and enrich a user's shopping experience. Recommendation systems have several
benefits, the most important being customer satisfaction and revenue. The movie recommendation
system is a very powerful and important system. However, because of the problems associated
with the pure collaborative approach, movie recommendation systems also suffer from poor
recommendation quality and scalability issues.
1.IV Scope of the Project
● The objective of this project is to provide accurate movie recommendations to users. The
goal is to improve the accuracy, quality and scalability of the movie recommendation
system compared with the pure approaches. This is done using a hybrid approach that
combines content-based filtering and collaborative filtering. To reduce data overload,
recommendation systems are used as information-filtering tools in social networking
sites. Hence, there is huge scope for exploration in this field to improve the
scalability, accuracy and quality of movie recommendation systems.
Agile Methodology:
1. Collecting the data sets: All the required data sets are collected from the Kaggle website.
This project requires movies.csv, ratings.csv and users.csv.
2. Data Analysis: Make sure that the collected data sets are correct by analysing the
data in the CSV files, i.e. checking whether all the column fields are present in the data
sets.
3. Algorithms: Only two algorithms are used in our project to build the machine learning
recommendation model: one is cosine similarity and the other is singular value
decomposition (SVD).
4. Training and Testing the model: Once the implementation of the algorithms is completed,
we have to train the model to get the result. We have tested it several times, and the model
recommends different sets of movies to different users.
5. Improvements in the project: At a later stage we can implement different algorithms
and methods for better recommendations.
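The data analysis step above (checking that every expected column is present) can be sketched as follows. This is a minimal illustration, not the project's actual code: the column names in REQUIRED_COLUMNS are assumptions, since the report does not list the headers of the Kaggle files, and an in-memory sample stands in for the real ratings file.

```python
import csv
import io

# Hypothetical column layouts; the report does not list the actual headers.
REQUIRED_COLUMNS = {
    "movies.csv": {"movieId", "title", "genres"},
    "ratings.csv": {"userId", "movieId", "rating"},
    "users.csv": {"userId"},
}


def missing_columns(file_name, handle):
    """Return the required columns that are absent from an open CSV file."""
    reader = csv.DictReader(handle)
    present = set(reader.fieldnames or [])
    return REQUIRED_COLUMNS[file_name] - present


# In-memory stand-in for ratings.csv so the sketch runs without the real data.
sample_ratings = io.StringIO("userId,movieId,rating\n1,10,4.0\n2,10,3.5\n")
missing = missing_columns("ratings.csv", sample_ratings)
print(missing)  # an empty set means every required column is present
```

With the real files, the same function could be called on `open("ratings.csv", newline="")` instead of the in-memory sample.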
Chapter 2
Research Design & Methodology
Over the years, many recommendation systems have been developed using collaborative,
content-based or hybrid filtering methods. These systems have been implemented using various
big data and machine learning algorithms.
A recommendation system collects data about the user’s preferences, either implicitly or
explicitly, on different items such as movies. Implicit acquisition uses the user’s behaviour
while watching movies. Explicit acquisition, on the other hand, uses the user’s
previous ratings or history. Another supporting technique used in the development of
recommendation systems is clustering. Clustering is a process that groups a set of objects in
such a way that objects in the same cluster are more similar to each other than to those in other
clusters. K-Means clustering along with K-Nearest Neighbour is implemented on the MovieLens
dataset in order to obtain the best-optimized result. In the existing technique, the data is
scattered, which results in a high number of clusters, while in the proposed technique the data
is gathered, which results in a low number of clusters. The process of recommending a movie is
optimized in the proposed scheme. The proposed recommender system predicts the user’s preference
for a movie on the basis of different parameters. The recommender system works on the concept
that people have common preferences or choices, and that these users influence each other’s
opinions. This optimizes the process and yields a lower RMSE.
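Since the comparison above is stated in terms of RMSE (root-mean-square error), a short sketch of how that metric is usually computed may help. The ratings below are made up for illustration and are not results from this project; RMSE is the square root of the mean squared difference between actual and predicted ratings, so lower is better.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length rating sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up ratings on a 1-5 scale, for illustration only.
true_ratings = [4.0, 3.0, 5.0, 2.0]
model_ratings = [3.5, 3.0, 4.0, 2.5]
print(round(rmse(true_ratings, model_ratings), 3))
```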
number of nearest neighbours. We calculate correlation between users' ratings using
Pearson Correlation algorithm.
● Item-based filtering: Unlike the user-based filtering method, item-based filtering focuses on
the similarity between the items that users like instead of the similarity between the users
themselves. The most similar items are computed ahead of time. Then, for recommendation, the
items that are most similar to the target item are recommended to the user.
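The two filtering styles above can be sketched in a few lines. This is an illustrative example under stated assumptions: the ratings dictionary is made up, and using Pearson correlation for the item-item table as well is a choice made for this sketch, since the report only names Pearson correlation for the user-based case.

```python
import math

# Made-up ratings: user -> {movie: rating}. Not taken from the project data.
ratings = {
    "u1": {"A": 5.0, "B": 3.0, "C": 4.0},
    "u2": {"A": 4.0, "B": 2.0, "C": 3.0},
    "u3": {"A": 1.0, "B": 5.0, "C": 2.0},
}


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if undefined."""
    n = len(x)
    if n < 2:
        return 0.0
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def user_similarity(u, v):
    """User-based: correlate two users over the movies both have rated."""
    common = sorted(set(ratings[u]) & set(ratings[v]))
    return pearson([ratings[u][m] for m in common],
                   [ratings[v][m] for m in common])


def item_similarity(i, j):
    """Item-based: correlate two movies over the users who rated both."""
    common = sorted(u for u in ratings if i in ratings[u] and j in ratings[u])
    return pearson([ratings[u][i] for u in common],
                   [ratings[u][j] for u in common])


# Item-based filtering computes the item-item table ahead of time.
movies = sorted({m for user in ratings.values() for m in user})
item_table = {(i, j): item_similarity(i, j)
              for i in movies for j in movies if i != j}
```

Here u1 and u2 rate every movie exactly one point apart, so their correlation is 1, while u3 has roughly opposite tastes and correlates negatively with u1.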
4. Data Analysis:
● The collected data from blogs, existing models, focus groups, user testing, and market
research were subjected to rigorous analysis.
● Qualitative data from blogs, focus groups, and user testing sessions were transcribed,
coded, and thematically analyzed to identify key insights and patterns.
● The data analysis provided a comprehensive understanding of user needs, preferences,
pain points, and market trends, which formed the foundation for decision-making in the
system design and development process.
● By employing these primary research techniques and conducting detailed analysis, a
comprehensive and data-driven understanding of user needs, market dynamics, and
industry trends was obtained. The findings from primary research informed key decisions
in the initial phase of system development for Voyance, ensuring that the proposed
system addresses user requirements, aligns with industry best practices, and delivers an
enhanced user experience.
Chapter 3
SYSTEM REQUIREMENTS SPECIFICATION
This chapter covers both the hardware and software requirements needed for the project, with a
detailed explanation of the specifications.
● A PC with Windows/Linux OS
● Processor with 1.7-2.4 GHz speed
● Minimum of 8 GB RAM
● 2 GB graphics card
Chapter 4
SYSTEM ANALYSIS AND DESIGN
4.1 System Architecture of Proposed System:
A different list of movies is recommended for each individual user. When the user
logs in or enters a user ID, each of the two approaches used in the project
recommends a set of movies to that particular user. By combining both sets of
movies, the hybrid model recommends a single list of movies
to the user.
4.3 Dataflow:
First, the data sets required to build the model are loaded. The data sets required in this
project are movies.csv, ratings.csv and users.csv; all of them are available on Kaggle.com.
Two models are built in this project, content-based and collaborative filtering. Each
produces a list of movies for a particular user; by combining both lists based on the user ID,
a single final list of movies is recommended to that user.
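The combination step in this data flow can be sketched as a weighted blend of the two models' scores. The report does not say how the two lists are merged, so the linear blend, the `weight` parameter and the sample scores below are all assumptions made for illustration.

```python
def hybrid_recommendations(content_scores, collab_scores, weight=0.5, top_n=5):
    """Blend two {movie: score} dicts into one ranked list of movie titles.

    `weight` is the share given to the content-based score; the remainder goes
    to the collaborative score. Both the weight and the linear blend are
    assumptions made for this sketch.
    """
    combined = {}
    for movie in set(content_scores) | set(collab_scores):
        c = content_scores.get(movie, 0.0)
        f = collab_scores.get(movie, 0.0)
        combined[movie] = weight * c + (1.0 - weight) * f
    # Sort by score (highest first), then by title so ties are deterministic.
    ranked = sorted(combined, key=lambda m: (-combined[m], m))
    return ranked[:top_n]


# Made-up scores in [0, 1] for a single user.
content = {"Heat": 0.9, "Alien": 0.4, "Up": 0.1}
collab = {"Alien": 0.8, "Up": 0.7, "Jaws": 0.6}
print(hybrid_recommendations(content, collab, top_n=3))
```

A movie that appears in only one list still receives a score, because the missing side is treated as 0.0; a different default would change the ranking.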
Activity Diagram:
Once the user logs in by entering a user ID that is present in the CSV file (ranging from 1 to
5000), the list of movies is recommended to the user.
Chapter 5
IMPLEMENTATION
The proposed system makes use of different algorithms and methods for the implementation
of the hybrid approach.
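One of the techniques named in this report is cosine similarity, which measures the angle between two vectors: the dot product divided by the product of the vector lengths. The sketch below applies it to one-hot genre vectors; the movie names and the four-genre encoding are hypothetical and only illustrate the idea.

```python
import math


def cosine_similarity(a, b):
    """cos(theta) = (a . b) / (|a| * |b|); returns 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical one-hot genre vectors: [action, comedy, thriller, animation].
genre_vectors = {
    "Movie A": [1, 0, 1, 0],
    "Movie B": [1, 0, 1, 0],
    "Movie C": [0, 1, 0, 1],
    "Movie D": [1, 1, 0, 0],
}


def most_similar(title, top_n=2):
    """Rank the other movies by cosine similarity to `title`."""
    target = genre_vectors[title]
    scores = {other: cosine_similarity(target, vec)
              for other, vec in genre_vectors.items() if other != title}
    return sorted(scores, key=lambda m: (-scores[m], m))[:top_n]


print(most_similar("Movie A"))
```

Movies that share all genres score close to 1, movies with one shared genre score lower, and movies with no shared genre score 0.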
Formula:
We first prove a simple lemma stating that two matrices A and B are identical if Av = Bv
for all v. In the abstract, the lemma says that a matrix A can be viewed as a transformation
that maps a vector v onto Av.
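The reasoning behind the lemma can be checked numerically: multiplying a matrix by the i-th standard basis vector returns its i-th column, so if Av = Bv for every v, then in particular every column of A equals the matching column of B. The small matrices below are arbitrary examples chosen for this sketch, and both matrices are assumed to have the same shape.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def basis_vector(size, index):
    """Standard basis vector e_index of the given size."""
    return [1 if k == index else 0 for k in range(size)]


def agree_on_basis(A, B):
    """True when A e_i == B e_i for every standard basis vector e_i.

    Assumes A and B have the same shape.
    """
    n = len(A[0])
    return all(mat_vec(A, basis_vector(n, i)) == mat_vec(B, basis_vector(n, i))
               for i in range(n))


A = [[2, 0], [1, 3]]
B = [[2, 0], [1, 3]]
C = [[2, 0], [1, 4]]

# A e_i is column i of A, so agreement on every e_i means equal columns.
second_column = mat_vec(A, basis_vector(2, 1))
print(second_column, agree_on_basis(A, B), agree_on_basis(A, C))
```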
Experimental requirements:
Code: Front-end
In this project we have used a popular front-end web framework (React.js) to build an
interactive user interface.
In React.js we used the axios npm module to fetch the data from the API that is generated by
Flask.
Backend: For the back end we have used a Flask app to generate a localhost API; the
resulting API is fetched in the front end to display the result.
Using Flask, we generate the API, which returns the data in JSON format.
This data is retrieved in React using the axios npm module and then displayed.
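The JSON exchange described above can be illustrated with the standard library alone. The field names `userId` and `recommendations` are hypothetical, since the report does not show the actual response format, and the Flask route itself is deliberately left out so that the sketch has no third-party dependency.

```python
import json


def recommendations_payload(user_id, titles):
    """Serialise one user's recommendations to the JSON text an API could return."""
    body = {"userId": user_id, "recommendations": list(titles)}
    return json.dumps(body)


# In a Flask app this dictionary would typically be returned from a route
# (for example through flask.jsonify); that wiring is omitted on purpose.
response_text = recommendations_payload(42, ["Heat", "Alien"])

# The React side (axios) would parse the same JSON; json.loads mimics that here.
parsed = json.loads(response_text)
print(parsed["recommendations"])
```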
Chapter 6
RESULTS AND DISCUSSION
6. Future Scope:
Our project is a movie recommendation system. One can develop a movie
recommendation system by using either content-based or collaborative filtering, or by
combining both.
In our project we have developed a hybrid approach, i.e. a combination of both content-based
and collaborative filtering. Both approaches have advantages and disadvantages.
Content-based filtering relies on the user's ratings or likes, so only movies of that kind
are recommended to the user.
Advantages: It is easy to design and takes less time to compute.
Disadvantages: The model can only make recommendations based on the existing interests
of the user. In other words, the model has limited ability to expand on the user's
existing interests.
Disadvantages (collaborative filtering): The prediction of the model for a given (user, item)
pair is the dot product of the corresponding embeddings. So, if an item is not seen during
training, the system can't create an embedding for it and can't query the model with this
item. This issue is often called the cold-start problem.
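The dot-product prediction and the cold-start problem can be made concrete with a tiny example. The two-dimensional embeddings below are made up; a trained model would learn such vectors from the ratings data.

```python
# Made-up 2-dimensional embeddings; a trained model would learn these values.
user_embeddings = {"u1": [0.9, 0.1], "u2": [0.2, 0.8]}
item_embeddings = {"Heat": [1.0, 0.0], "Up": [0.0, 1.0]}


def predict(user, item):
    """Score = dot product of the user and item embeddings, or None if unseen."""
    if user not in user_embeddings or item not in item_embeddings:
        return None  # cold start: no embedding was learned for this id
    u, v = user_embeddings[user], item_embeddings[item]
    return sum(a * b for a, b in zip(u, v))


print(predict("u1", "Heat"))       # known pair: a numeric score
print(predict("u1", "New Movie"))  # unseen item: None
```

Returning None for an unseen item is one way to surface the problem; a hybrid system could fall back to the content-based score in that case.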
The hybrid approach resolves these limitations by combining both content-based and
collaborative filtering.
The main disadvantage of the hybrid approach is that it requires high memory.
Fig. 6.3: Display of the list of recommended movies
Once the name of a movie is entered, the list of recommended movies is displayed.
Chapter 7
TESTING
System testing is actually a series of different tests whose primary purpose is to fully
exercise the computer-based system. Although each test has a different purpose, all work to
verify that all the system elements have been properly integrated and perform their allocated
functions. The testing process is carried out to make sure that the product does exactly
what it is supposed to do. In the testing stage, the following goals are pursued:
7. TESTING METHODOLOGIES:
There are many different types of testing methods or techniques used as part of the
software testing methodology. Some of the important testing methodologies are:
Unit Testing:
Unit testing is the first level of testing and is often performed by the developers
themselves. It is the process of ensuring individual components of a piece of software
at the code level are functional and work as they were designed to. Developers in a
test-driven environment will typically write and run the tests prior to the software or
feature being passed over to the test team. Unit testing can be conducted manually, but
automating the process will speed up delivery cycles and expand test coverage. Unit
testing will also make debugging easier because finding issues earlier means they take
less time to fix than if they were discovered later in the testing process. Test Left is a
tool that allows advanced testers and developers to shift left with the fastest test
automation tool embedded in any IDE.
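As an illustration of automated unit testing, the sketch below tests a small ranking helper with Python's built-in unittest module. The `top_n` function is hypothetical and stands in for any individual component of the recommender; it is not taken from the project code.

```python
import unittest


def top_n(scores, n):
    """Hypothetical helper: the n highest-scoring titles, best first."""
    return sorted(scores, key=lambda title: (-scores[title], title))[:n]


class TopNTest(unittest.TestCase):
    def test_returns_best_first(self):
        self.assertEqual(top_n({"A": 0.2, "B": 0.9, "C": 0.5}, 2), ["B", "C"])

    def test_n_larger_than_catalogue(self):
        self.assertEqual(top_n({"A": 0.2}, 5), ["A"])


# Run the tests programmatically so the sketch also works outside the CLI.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TopNTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test method checks one behaviour of the unit in isolation, which is what makes a failure easy to trace back to its cause.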
Integration Testing:
After each unit is thoroughly tested, it is integrated with other units to create modules or
components that are designed to perform specific tasks or activities. These are then tested
as a group through integration testing to ensure whole segments of an application behave
as expected (i.e., the interactions between units are seamless). These tests are often
framed by user scenarios, such as logging into an application or opening files. Integrated
tests can be conducted by either developers or independent testers and are usually
comprised of a combination of automated functional and manual tests.
System Testing
System testing is a black box testing method used to evaluate the completed and
integrated system, as a whole, to ensure it meets specified requirements. The
functionality of the software is tested from end to end, and this testing is typically
conducted by a testing team separate from the development team before the product is
pushed into production.
Chapter 8
FUTURE SCOPE
8.1 Future scope:
In the proposed approach, the genres of movies have been considered. In future, we can also
consider the age of the user, since movie preferences change with age; for
example, during childhood we like animated movies more than other
movies. There is a need to work on the memory requirements of the proposed approach in
the future. The proposed approach has been implemented here on different movie datasets
only. It can also be implemented on the Film Affinity and Netflix datasets, and the
performance can be computed in the future.
Chapter 9
CONCLUSION
8.1 Conclusion
In this project, to improve the accuracy, quality and scalability of the movie recommendation
system, a hybrid approach that unifies content-based filtering and collaborative filtering,
using Singular Value Decomposition (SVD) as a classifier and cosine similarity, is
presented in the proposed methodology. The existing pure approaches and the proposed hybrid
approach are implemented on three different movie datasets and the results are compared.
Comparative results show that the proposed approach improves the accuracy, quality and
scalability of the movie recommendation system compared with the pure approaches. Also, the
computing time of the proposed approach is lower than that of the other two pure approaches.
Chapter 11
Appendix
9.II Screen Print-Outs
Fig 9.2.3 Movie Page (a)
Fig 9.2.5 Netflix Page
Chapter 12
Plagiarism Report