Movie Recommender Systems

Uploaded by

Smit Mandavia

Project Title: Movie Recommender Systems

Tools: Jupyter Notebook and VS Code

Technologies: Data Science

Project Difficulty Level: Intermediate

Dataset: The dataset is available at the link below. You can download it at your convenience.

Click here to download data set

Movies Recommender System

About Dataset
Context
These files contain metadata for all 45,000 movies listed in the Full MovieLens Dataset. The dataset consists of
movies released on or before July 2017. Data points include cast, crew, plot keywords, budget, revenue, posters,
release dates, languages, production companies, countries, TMDB vote counts and vote averages.

This dataset also has files containing 26 million ratings from 270,000 users for all 45,000 movies. Ratings are on a
scale of 1-5 and have been obtained from the official GroupLens website.

Content
This dataset consists of the following files:
movies_metadata.csv: The main Movies Metadata file. Contains information on 45,000 movies featured in the Full
MovieLens dataset. Features include posters, backdrops, budget, revenue, release dates, languages, production
countries and companies.

keywords.csv: Contains the movie plot keywords for our MovieLens movies. Available in the form of a stringified
JSON Object.

credits.csv: Consists of Cast and Crew Information for all our movies. Available in the form of a stringified JSON
Object.

links.csv: The file that contains the TMDB and IMDB IDs of all the movies featured in the Full MovieLens dataset.

links_small.csv: Contains the TMDB and IMDB IDs of a small subset of 9,000 movies of the Full Dataset.

ratings_small.csv: The subset of 100,000 ratings from 700 users on 9,000 movies.

The Full MovieLens Dataset consisting of 26 million ratings and 750,000 tag applications from 270,000 users on all
the 45,000 movies in this dataset can be accessed here
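
The keywords.csv and credits.csv files store lists of objects as strings, so each cell has to be parsed before use. The sketch below shows one possible way to do that. It assumes the cells use Python-literal syntax (single-quoted keys), which ast.literal_eval can read safely, and falls back to json.loads for strictly valid JSON. The helper name parse_names and the sample string are hypothetical and only for illustration.

```python
import ast
import json


def parse_names(cell):
    """Return the 'name' field of every object in a stringified list."""
    if not isinstance(cell, str) or not cell.strip():
        return []  # missing values (e.g. NaN) or empty cells
    try:
        # Handles Python-literal style: "[{'id': 1, 'name': 'x'}]"
        items = ast.literal_eval(cell)
    except (ValueError, SyntaxError):
        try:
            # Fallback for strict JSON containing null/true/false
            items = json.loads(cell)
        except json.JSONDecodeError:
            return []
    if not isinstance(items, list):
        return []
    return [item['name'] for item in items
            if isinstance(item, dict) and 'name' in item]


# Hypothetical cell content in the shape described above
sample = "[{'id': 931, 'name': 'jealousy'}, {'id': 4290, 'name': 'toy'}]"
keyword_names = parse_names(sample)
print(keyword_names)
```

With pandas, a helper like this could be applied to a whole column, for example keywords['keywords'].apply(parse_names), assuming the column is named that way in your copy of the file.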

Acknowledgements
This dataset is an ensemble of data collected from TMDB and GroupLens.
The Movie Details, Credits and Keywords have been collected from the TMDB Open API. This product uses the
TMDb API but is not endorsed or certified by TMDb. Their API also provides access to data on many additional
movies, actors and actresses, crew members, and TV shows. You can try it for yourself here.

The Movie Links and Ratings have been obtained from the Official GroupLens website. The files are a part of the
dataset available here

Inspiration
This dataset was assembled as part of my second Capstone Project for Springboard's Data Science Career Track. I
wanted to perform an extensive EDA on Movie Data to narrate the history and the story of Cinema and use this
metadata in combination with MovieLens ratings to build various types of Recommender Systems.

Both my notebooks are available as kernels with this dataset: The Story of Film and Movie Recommender Systems

Some of the things you can do with this dataset:

● Predicting movie revenue and/or movie success based on a certain metric.
● What movies tend to get higher vote counts and vote averages on TMDB?
● Building Content Based and Collaborative Filtering Based Recommendation Engines.

Movie Recommender System Machine Learning Project

This project involves building a movie recommender system using machine learning techniques. Here's a
step-by-step guide:

1. Problem Definition

Objective: Develop a movie recommender system that suggests movies to users based on their past behavior and
preferences.

2. Data Collection

For this example, we'll use the MovieLens dataset, which is commonly used for movie recommendation systems.
You can download it from MovieLens.

3. Data Preprocessing

import pandas as pd

# Load the datasets

movies = pd.read_csv('movies.csv')
ratings = pd.read_csv('ratings.csv')

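# Display basic info (info() prints directly, so no print() is needed)

movies.info()
ratings.info()

# Check for missing values in each column

print(movies.isnull().sum())
print(ratings.isnull().sum())

print(movies.head())
print(ratings.head())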

4. Exploratory Data Analysis (EDA)

import seaborn as sns

import matplotlib.pyplot as plt

# Basic statistics
print(ratings.describe())
# Histogram of ratings
ratings['rating'].hist(bins=30)
plt.title('Distribution of Movie Ratings')
plt.xlabel('Rating')
plt.ylabel('Count')
plt.show()

# Number of ratings per movie

ratings_per_movie = ratings.groupby('movieId')['rating'].count()
ratings_per_movie.hist(bins=50)
plt.title('Number of Ratings per Movie')
plt.xlabel('Number of Ratings')
plt.ylabel('Count')
plt.show()

5. Building the Recommender System

Collaborative Filtering using Matrix Factorization (SVD)

from surprise import Dataset, Reader, SVD

from surprise.model_selection import cross_validate

# Load the data into Surprise format

reader = Reader(rating_scale=(0.5, 5))
data = Dataset.load_from_df(ratings[['userId', 'movieId', 'rating']], reader)

# Use SVD for matrix factorization

svd = SVD()

# Cross-validation to evaluate the algorithm

cross_validate(svd, data, measures=['RMSE', 'MAE'], cv=5, verbose=True)
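
To make the two reported measures concrete, here is a small sketch of what RMSE and MAE compute on a handful of held-out ratings. The numbers are hypothetical and the functions are plain-Python illustrations, not Surprise's own implementation.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: penalizes large misses more heavily."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error: the average size of a miss."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


# Hypothetical held-out ratings and the model's estimates for them
actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 3.0]

rmse_value = rmse(actual, predicted)
mae_value = mae(actual, predicted)
print(f"RMSE: {rmse_value:.4f}, MAE: {mae_value:.4f}")
```

Lower is better for both. RMSE is never smaller than MAE on the same data, and a large gap between them suggests a few predictions are far off.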

Training the Model

trainset = data.build_full_trainset()
svd.fit(trainset)

6. Making Predictions

# Predict the rating for a specific user and movie

user_id = 1
movie_id = 10
rating_prediction = svd.predict(user_id, movie_id)
print(f"Predicted rating for user {user_id} and movie {movie_id}: {rating_prediction.est}")
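
As a rough illustration of where the estimate comes from: Surprise's SVD documents its prediction as the global mean plus a user bias, an item bias, and the dot product of the user and item factor vectors, clipped to the rating scale by default. The sketch below reproduces that arithmetic with hypothetical values. The function name predict_rating and all the numbers are assumptions for illustration; a fitted model learns these quantities from the data.

```python
def predict_rating(mu, b_u, b_i, p_u, q_i, low=0.5, high=5.0):
    """Estimate = mu + b_u + b_i + q_i . p_u, clipped to [low, high]."""
    dot = sum(p * q for p, q in zip(p_u, q_i))
    estimate = mu + b_u + b_i + dot
    return min(high, max(low, estimate))


# Hypothetical learned parameters
mu = 3.5            # global mean rating
b_u = 0.2           # this user rates slightly above average
b_i = -0.1          # this movie is rated slightly below average
p_u = [0.5, -0.2]   # user latent factors
q_i = [0.4, 0.1]    # movie latent factors

estimate = predict_rating(mu, b_u, b_i, p_u, q_i)
print(round(estimate, 2))
```

For a user or movie that did not appear in the training set, Surprise treats the corresponding bias and factors as zero, so the estimate falls back toward the global mean.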

7. Recommending Movies

# Function to recommend top N movies for a given user

def recommend_movies(user_id, num_recommendations=10):
    # Collect the movies this user has already rated so they can be excluded
    rated_ids = set(ratings.loc[ratings['userId'] == user_id, 'movieId'])

    # Candidate movies: every movie the user hasn't rated yet
    movie_ids = [m for m in movies['movieId'].unique() if m not in rated_ids]

    # Predict a rating for each candidate movie
    movie_ratings = [svd.predict(user_id, movie_id).est for movie_id in movie_ids]

    # Create a DataFrame of movie ids and predicted ratings
    recommendations = pd.DataFrame({
        'movieId': movie_ids,
        'predicted_rating': movie_ratings
    })

    # Sort the DataFrame by predicted rating in descending order
    recommendations = recommendations.sort_values(by='predicted_rating',
                                                  ascending=False)

    # Get the top N recommended movies
    top_recommendations = recommendations.head(num_recommendations)

    # Merge with the movies DataFrame to get movie titles
    top_recommendations = pd.merge(top_recommendations, movies, on='movieId')

    return top_recommendations

# Recommend top 10 movies for user with ID 1

recommendations = recommend_movies(1, 10)
print(recommendations)

8. Deployment

To deploy the recommender system, you could create a simple web application using Flask.

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/recommend', methods=['POST'])
def recommend():
    # Expect a JSON body such as {"user_id": 1, "num_recommendations": 10}
    data = request.get_json(force=True)
    user_id = data['user_id']
    num_recommendations = data.get('num_recommendations', 10)
    recommendations = recommend_movies(user_id, num_recommendations)
    return jsonify(recommendations.to_dict(orient='records'))

if __name__ == '__main__':
    # debug=True is for local development only
    app.run(debug=True)

9. Monitoring and Maintenance

Set up logging and monitoring to track the performance of your recommender system, and
schedule regular retraining with new data to keep the recommendations relevant.
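
One possible way to sketch this, using only the standard library: log the error measured on recent ratings next to the error recorded at training time, and flag the model for retraining when the gap grows too large. The function name needs_retraining, the tolerance of 0.05, and the RMSE values are all hypothetical choices for illustration.

```python
import logging

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("recommender")


def needs_retraining(baseline_rmse, recent_rmse, tolerance=0.05):
    """Flag the model for retraining when recent error drifts above baseline."""
    drift = recent_rmse - baseline_rmse
    logger.info("baseline_rmse=%.4f recent_rmse=%.4f drift=%.4f",
                baseline_rmse, recent_rmse, drift)
    if drift > tolerance:
        logger.warning("RMSE drift %.4f exceeds tolerance %.4f; retraining advised",
                       drift, tolerance)
        return True
    return False


# Hypothetical numbers: RMSE at training time vs. RMSE on newly collected ratings
should_retrain = needs_retraining(baseline_rmse=0.90, recent_rmse=0.97)
print(should_retrain)
```

A check like this could run on a schedule (for example a daily job) and trigger the training steps from section 5 when it returns True.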

10. Documentation and Reporting

Maintain comprehensive documentation of the project, including data sources, preprocessing
steps, model selection, and evaluation results. Create detailed reports and visualizations to
communicate findings and insights to stakeholders.

Tools and Technologies

● Programming Language: Python

● Libraries: pandas, numpy, seaborn, matplotlib, scikit-learn, Surprise, Flask
● Visualization Tools: Tableau, Power BI, or any dashboarding tool for advanced
visualizations
This is a basic outline of a movie recommender system project. Depending on your specific
goals and data, you may need to adjust the steps accordingly.

Sample Project Report

Simple Recommender
The Simple Recommender offers generalized recommendations to every user based on movie popularity
and (sometimes) genre. The basic idea behind this recommender is that movies that are more popular and
more critically acclaimed will have a higher probability of being liked by the average audience. This model
does not give personalized recommendations based on the user.

The implementation of this model is straightforward. All we have to do is sort our movies based on ratings
and popularity and display the top movies of our list. As an added step, we can pass in a genre argument to
get the top movies of a particular genre.
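
The sort-and-filter idea described above can be sketched as follows. This is a minimal illustration, not the report's actual implementation: the column names (title, genres, vote_average, vote_count, popularity) are assumed to match movies_metadata.csv, genres is assumed to be already parsed into a list per movie, and the min_votes cutoff is an added safeguard so that a title with only a handful of votes does not top the list.

```python
import pandas as pd


def top_movies(df, genre=None, min_votes=50, n=10):
    """Rank movies by average vote, breaking ties by popularity."""
    candidates = df
    if genre is not None:
        # Keep only rows whose genre list contains the requested genre
        candidates = candidates[candidates['genres'].apply(lambda g: genre in g)]
    # Assumed safeguard: ignore titles with too few votes to trust the average
    candidates = candidates[candidates['vote_count'] >= min_votes]
    ranked = candidates.sort_values(['vote_average', 'popularity'], ascending=False)
    return ranked.head(n)[['title', 'vote_average', 'vote_count', 'popularity']]


# Hypothetical toy data standing in for the real metadata file
toy = pd.DataFrame({
    'title': ['A', 'B', 'C', 'D'],
    'genres': [['Drama'], ['Comedy'], ['Drama', 'Comedy'], ['Drama']],
    'vote_average': [8.1, 7.4, 8.1, 9.5],
    'vote_count': [1200, 300, 900, 3],
    'popularity': [20.0, 15.0, 35.0, 1.0],
})

overall = top_movies(toy, n=3)
comedies = top_movies(toy, genre='Comedy', n=3)
print(overall)
print(comedies)
```

Movie D has the highest average but only three votes, so the cutoff removes it. Without such a threshold, sparsely rated titles tend to dominate a ranking based on averages alone.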
Reference link
