
A

Seminar Report
On

MACHINE LEARNING
Submitted in partial fulfilment
For the award of the degree of
Bachelor of Technology
In
ELECTRONICS AND COMMUNICATION ENGINEERING
(Rajasthan Technical University, Kota)

SUBMITTED TO:
KRINA DAYANI
(Guest Faculty)

SUBMITTED BY:
Name: LAKSH SARONJA
Roll No.: 23/296
URN: 23EUCEC027

DEPARTMENT OF ELECTRONICS ENGINEERING


RAJASTHAN TECHNICAL UNIVERSITY, KOTA
DECEMBER 2024
Abstract
The Movie Recommendation System is a Python-based project designed to enhance the
movie-watching experience by providing personalized recommendations. Utilizing
powerful data manipulation libraries like Pandas and NumPy, this system analyzes user
preferences, historical data, and movie features to deliver accurate and tailored suggestions.

The core objective of this project is to implement recommendation algorithms, such as
content-based filtering and collaborative filtering, which enable the system to predict
movies a user might enjoy. The dataset used includes movie metadata, ratings, and user
interactions, enabling the system to model relationships between movies and viewer
preferences.

By leveraging Python's flexibility and efficient computational tools, the project
demonstrates the practical application of machine learning concepts in real-world
scenarios. The system could be extended for commercial use by integrating additional
data sources and by improving recommendation accuracy through deep learning techniques.

This project was developed as part of an internship program at YBI Foundation, showcasing
the integration of theoretical learning and hands-on experience in software development
and data science.
ACKNOWLEDGEMENT
I would like to express my gratitude to the people who contributed to this report, directly
or indirectly, and who gave unending support from the stage at which the idea was conceived.
It gives me great pleasure to acknowledge and thank those who were associated with me
during my internship at YBI Foundation.

I take this opportunity to thank the industrial training coordinator, H.O.D. of the Computer
Science and Engineering department. I am highly indebted to my project guide, Dr. Alok
Yadav (Training Instructor), for his guidance and words of wisdom. He always showed me
the right direction during the course of this project work. I am duly thankful to him
for teaching me, referring me to various blocks, providing work, and permitting me to
undertake training of 4 weeks' duration.
Movie Recommendation System Using Python

Contents

1 Objective

2 Internship Experience
2.1 YBI Foundation Overview
2.2 Role in the Internship
2.3 Skills Acquired

3 Technologies and Tools Used
3.1 Python Programming
3.2 Libraries Used

4 Dataset Details
4.1 Source of the Dataset
4.2 Structure of the Dataset
4.3 Statistics

5 Steps Undertaken
5.1 Data Import and Exploration
5.2 Data Cleaning and Preprocessing
5.3 Data Visualization

6 Recommendation Algorithm
6.1 Why Collaborative Filtering?
6.2 SVD Implementation

7 Model Evaluation
7.1 Metrics Used
7.2 Performance

8 Prediction and Results
8.1 Top Recommendations for User X:
8.2 Strengths and Limitations

9 Conclusion and Future Scope
9.1 Future Improvements

10 References

1 Objective

The objective of this industrial training was to develop a Movie
Recommendation System capable of making personalized movie
recommendations to users. With the growing use of online streaming
platforms, personalized content recommendations have become a core
component of enhancing user engagement. By analyzing user ratings
and applying recommendation algorithms, the system attempts to
predict and recommend movies that align with each user's tastes.
This project was completed as part of a 15-day internship at YBI
Foundation, where the focus was on learning Python
programming, mastering data manipulation techniques, and
implementing real-world machine learning models.

2 Internship Experience

2.1 YBI Foundation Overview

The YBI Foundation offers short-term internships focusing on building foundational and
advanced skills in Python programming and data science. Over the 15-day internship,
participants were introduced to a range of technologies essential for data-driven projects.

2.2 Role in the Internship

During my internship, I was tasked with implementing a real-world recommendation
system. My primary responsibilities included:

• Data Analysis: Understanding the dataset, cleaning the data, and preparing it
for modeling.

• Model Building: Applying collaborative filtering algorithms for recommendations.

• Visualization: Using data visualization tools to identify patterns and trends.

• Evaluation: Testing the model's accuracy using statistical metrics like RMSE and
MAE.

2.3 Skills Acquired

Technical Skills:

• Proficiency in Python libraries such as Pandas, NumPy, and Matplotlib.


• Familiarity with advanced algorithms like Singular Value Decomposition (SVD).

Professional Development:

• Improved problem-solving skills.


• Gained experience in documenting and presenting technical work.

3 Technologies and Tools Used

3.1 Python Programming

Python is a versatile programming language widely used in data science and
machine learning. It provides robust libraries for data manipulation,
statistical analysis, and machine learning.

3.2 Libraries Used


1. Pandas: A powerful library for data manipulation. Used for importing
datasets, cleaning missing data, and reshaping data structures.
2. NumPy: Efficient for numerical computations, enabling operations on
large arrays and matrices.
3. Matplotlib and Seaborn: Visualization libraries used to create
plots, graphs, and heatmaps. Helped understand rating distributions
and user preferences.
4. Surprise: A Python library specialized in recommendation systems.
Used to implement collaborative filtering techniques like SVD.
5. Scikit-learn: Essential for train-test splitting and model evaluation.

4 Dataset Details

4.1 Source of the Dataset

The dataset used in this project is the MovieLens dataset, a popular benchmark in the
recommendation systems domain. It contains user ratings for movies spanning multiple
genres.

4.2 Structure of the Dataset


• Movies.csv: Columns: Movie ID, Title, Genre.
Example: Movie ID 1 → Toy Story (1995) → Genre: Animation, Children, Adventure.

• Ratings.csv: Columns: User ID, Movie ID, Rating, Timestamp.
Example: User 5 rated Movie ID 1 with 4.5 stars.

• Links.csv (Optional): Provides identifiers that link each movie to external sources
such as IMDb.

4.3 Statistics
• Total Movies: Over 10,000
• Total Users: 100,000+
• Number of Ratings: 1,000,000+

5 Steps Undertaken

5.1 Data Import and Exploration

The first step was importing the data into the Python environment using Pandas. The
datasets were loaded using pd.read_csv() and explored for missing values and duplicates.
Example:

import pandas as pd

# Load the movie metadata and the user ratings from the CSV files.
movies = pd.read_csv('movies.csv')
ratings = pd.read_csv('ratings.csv')
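The check for missing values and duplicates could look roughly like the sketch below. It uses a tiny in-memory stand-in for ratings.csv so that it runs on its own; the rows are made up, and the column names (userId, movieId, rating, timestamp) are an assumption based on the usual MovieLens layout.

```python
import io

import pandas as pd

# Tiny in-memory stand-in for ratings.csv (made-up rows, assumed column names).
csv_text = """userId,movieId,rating,timestamp
1,1,4.0,964982703
1,3,,964981247
2,1,4.5,1445714835
2,1,4.5,1445714835
"""
ratings = pd.read_csv(io.StringIO(csv_text))

# Number of missing values in each column.
missing_per_column = ratings.isnull().sum()

# Number of rows that exactly repeat an earlier row.
duplicate_rows = int(ratings.duplicated().sum())

print(ratings.head())
print(missing_per_column)
print("duplicate rows:", duplicate_rows)
```

With the real files, the same calls would be applied to the DataFrames returned by pd.read_csv().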

5.2 Data Cleaning and Preprocessing


• Missing Data Handling: Removed rows with null values to avoid inconsistencies.
• Feature Engineering: Encoded genres into numerical format for easier analysis.
• Matrix Construction: Created a user-item matrix where rows represent users
and columns represent movies.
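The three preprocessing steps above might be sketched as follows. The data is a toy example, the column names are assumed, and the genre encoding assumes pipe-separated genre labels as in the MovieLens files; the report does not say which encoding was actually used.

```python
import pandas as pd

# Toy data standing in for movies.csv and ratings.csv (values are illustrative).
movies = pd.DataFrame({
    "movieId": [1, 2, 3],
    "title": ["Toy Story (1995)", "Jumanji (1995)", "Heat (1995)"],
    "genres": ["Adventure|Animation|Children", "Adventure|Children", "Action|Crime"],
})
ratings = pd.DataFrame({
    "userId": [1, 1, 2, 2, 3],
    "movieId": [1, 2, 1, 3, 2],
    "rating": [4.0, 3.5, 4.5, None, 2.0],
})

# Missing data handling: drop rows that contain null values.
ratings_clean = ratings.dropna()

# Feature engineering: one 0/1 column per genre label.
genre_flags = movies["genres"].str.get_dummies(sep="|")

# Matrix construction: rows are users, columns are movies, NaN means "not rated".
user_item = ratings_clean.pivot_table(index="userId", columns="movieId", values="rating")

print(genre_flags)
print(user_item)
```

Most cells of a real user-item matrix are empty, because each user rates only a small fraction of the catalogue.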

5.3 Data Visualization

Visualization was performed to understand patterns.

• Rating Distribution: Plotted a histogram to show the frequency of ratings.


• Most Popular Movies: Identified movies with the highest number of ratings.

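import matplotlib.pyplot as plt

# Histogram of all ratings (uses the ratings DataFrame loaded in Section 5.1).
ratings['rating'].hist(bins=5)
plt.title('Rating Distribution')
plt.xlabel('Rating')
plt.ylabel('Number of ratings')
plt.show()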
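The ranking of the most popular movies could be computed along these lines. This is a minimal sketch on toy data; the column names and the choice to rank by rating count (with the mean rating shown alongside) are assumptions.

```python
import pandas as pd

# Toy data; titles and ratings are illustrative only.
movies = pd.DataFrame({
    "movieId": [1, 2, 3],
    "title": ["Toy Story (1995)", "Jumanji (1995)", "Heat (1995)"],
})
ratings = pd.DataFrame({
    "userId": [1, 2, 3, 1, 2, 3],
    "movieId": [1, 1, 1, 2, 2, 3],
    "rating": [4.0, 4.5, 5.0, 3.0, 3.5, 2.0],
})

# Count and average the ratings per movie, attach titles, sort by count.
popularity = (
    ratings.groupby("movieId")["rating"]
    .agg(num_ratings="count", mean_rating="mean")
    .reset_index()
    .merge(movies, on="movieId")
    .sort_values("num_ratings", ascending=False)
)

top_movies = popularity.head(2)
print(top_movies[["title", "num_ratings", "mean_rating"]])
```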

6 Recommendation Algorithm

6.1 Why Collaborative Filtering?

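Collaborative filtering predicts a user's preferences from patterns in the rating behavior
of many users: users who rated movies similarly in the past are assumed to have similar
tastes. It needs only the ratings themselves, not descriptive information about the movies.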
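To make the idea concrete, here is a small neighbourhood-based illustration in NumPy. It is not the model built in this project (which uses SVD); the rating matrix, the cosine similarity measure, and the weighted-average prediction are all choices made for the sketch.

```python
import numpy as np

# Rows are users, columns are movies; 0 marks "not rated" (toy values).
R = np.array([
    [5.0, 4.0, 0.0, 1.0],  # user 0 has not rated movie 2
    [5.0, 5.0, 4.0, 1.0],  # user 1 rates much like user 0
    [1.0, 2.0, 1.0, 5.0],  # user 2 has roughly opposite taste
])


def cosine(a, b):
    """Cosine similarity over the movies that both users have rated."""
    both = (a > 0) & (b > 0)
    if not both.any():
        return 0.0
    return float(a[both] @ b[both] / (np.linalg.norm(a[both]) * np.linalg.norm(b[both])))


target_user, target_movie = 0, 2

# Other users who have rated the target movie.
neighbours = [u for u in range(len(R)) if u != target_user and R[u, target_movie] > 0]
sims = np.array([cosine(R[target_user], R[u]) for u in neighbours])
neighbour_ratings = np.array([R[u, target_movie] for u in neighbours])

# Similarity-weighted average of the neighbours' ratings for the target movie.
predicted = float(sims @ neighbour_ratings / sims.sum())
print(f"predicted rating: {predicted:.2f}")
```

The prediction leans toward the rating given by the more similar user, which is the behaviour collaborative filtering relies on.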

6.2 SVD Implementation

The Surprise library's SVD (Singular Value Decomposition) algorithm was used to build the
recommendation model.

Steps Taken:

1. Data Preparation: Loaded data into the Surprise library format.

2. Apply Algorithm: Trained the model using SVD.

3. Generate Predictions: Predicted user ratings for unseen movies.
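The sketch below illustrates the kind of model behind these steps: a matrix factorization with user and movie biases, trained by stochastic gradient descent. It is a hand-written NumPy stand-in, not the Surprise API and not the project's actual code; the toy ratings, number of factors, learning rate, regularisation, and epoch count are all assumed values.

```python
import numpy as np

# (user, movie, rating) triples with indices already mapped to 0..n-1 (toy values).
ratings = [(0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.5), (1, 2, 2.0),
           (2, 1, 1.5), (2, 2, 4.0), (3, 0, 5.0), (3, 2, 1.0)]
n_users, n_items, n_factors = 4, 3, 2

rng = np.random.default_rng(0)
P = rng.normal(0, 0.1, (n_users, n_factors))  # user factor vectors
Q = rng.normal(0, 0.1, (n_items, n_factors))  # movie factor vectors
bu = np.zeros(n_users)                        # user biases
bi = np.zeros(n_items)                        # movie biases
mu = np.mean([r for _, _, r in ratings])      # global mean rating


def predict(u, i):
    """Estimated rating: global mean + biases + factor interaction."""
    return mu + bu[u] + bi[i] + P[u] @ Q[i]


def rmse():
    """Root mean squared error on the training triples."""
    return float(np.sqrt(np.mean([(r - predict(u, i)) ** 2 for u, i, r in ratings])))


rmse_before = rmse()
lr, reg = 0.01, 0.02  # assumed learning rate and regularisation strength
for epoch in range(200):
    for u, i, r in ratings:
        err = r - predict(u, i)
        bu[u] += lr * (err - reg * bu[u])
        bi[i] += lr * (err - reg * bi[i])
        # Both updates use the old factor values (right-hand side is evaluated first).
        P[u], Q[i] = (P[u] + lr * (err * Q[i] - reg * P[u]),
                      Q[i] + lr * (err * P[u] - reg * Q[i]))
rmse_after = rmse()

# Predicted rating for a (user, movie) pair that is absent from the training data.
unseen = float(np.clip(predict(0, 2), 0.5, 5.0))
print(f"training RMSE: {rmse_before:.3f} -> {rmse_after:.3f}, prediction for (0, 2): {unseen:.2f}")
```

With Surprise, the data-loading, training, and prediction steps are handled by the library, so none of this update logic has to be written by hand.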

7 Model Evaluation

7.1 Metrics Used


• RMSE (Root Mean Squared Error): Measures prediction accuracy.
• MAE (Mean Absolute Error): Evaluates average prediction error.
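Both metrics can be computed directly from paired actual and predicted ratings. The following minimal NumPy sketch uses made-up values purely to show the arithmetic; it does not reproduce the project's results.

```python
import numpy as np

# Made-up actual and predicted ratings for four (user, movie) pairs.
actual = np.array([4.0, 3.0, 5.0, 2.0])
predicted = np.array([3.5, 3.0, 4.0, 3.0])

errors = actual - predicted

# RMSE: square the errors, average them, take the square root.
rmse = float(np.sqrt(np.mean(errors ** 2)))

# MAE: average of the absolute errors.
mae = float(np.mean(np.abs(errors)))

print(f"RMSE = {rmse:.3f}, MAE = {mae:.3f}")  # RMSE = 0.750, MAE = 0.625
```

Because the errors are squared before averaging, RMSE penalises large mistakes more heavily than MAE, so it is never smaller than MAE on the same predictions.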

7.2 Performance

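The model achieved an RMSE of 0.87, meaning its predicted ratings typically deviate from
the actual ratings by somewhat less than one star, which indicates good prediction accuracy.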

8 Prediction and Results

8.1 Top Recommendations for User X:


• Inception (2010): 4.8
• The Dark Knight (2008): 4.7
• Interstellar (2014): 4.6
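A list like this is typically produced by scoring every movie the user has not yet rated and keeping the highest predictions. The sketch below assumes the predicted ratings are already available in a DataFrame; the function name top_n, the column names, and the two extra rows are hypothetical.

```python
import pandas as pd

# Predicted ratings for one user; the last two rows are hypothetical additions.
predictions = pd.DataFrame({
    "title": ["Inception (2010)", "The Dark Knight (2008)", "Interstellar (2014)",
              "Toy Story (1995)", "Jumanji (1995)"],
    "predicted_rating": [4.8, 4.7, 4.6, 4.9, 3.2],
})

# Movies the user has already rated are excluded from the recommendations.
already_rated = {"Toy Story (1995)"}


def top_n(preds, seen, n=3):
    """Return the n highest-scoring movies the user has not rated yet."""
    unseen = preds[~preds["title"].isin(seen)]
    return unseen.nlargest(n, "predicted_rating").reset_index(drop=True)


recommendations = top_n(predictions, already_rated, n=3)
print(recommendations)
```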

8.2 Strengths and Limitations

Strengths:

• Highly personalized.
• Efficient for large datasets.

Limitations:

• Struggles with new users or movies (cold start problem).
• Biased toward popular items.

9 Conclusion and Future Scope

The Movie Recommendation System successfully provided personalized suggestions.

9.1 Future Improvements


• Incorporate additional features like timestamps or user
demographics.
• Explore hybrid approaches combining collaborative and
content-based filtering.

10 References
• MovieLens Dataset: https://grouplens.org/datasets/movielens/
• Surprise Library Documentation: https://surprise.readthedocs.io/en/stable/
• Scikit-learn Official Guide: https://scikit-learn.org/
