Phase-3 project

Uploaded by

samgeoj12d
Copyright: © All Rights Reserved

NAME : Sajun Palraj

DEPARTMENT : COMPUTER SCIENCE AND ENGINEERING

PROJECT : PERSONALIZED CONTENT RECOMMENDATION (phase-3)
Phase-3: Document: Data Visualization

Personalized Content Recommendation

Introduction
In today's information age, users are bombarded with content. Recommender systems
address this challenge by filtering and suggesting content relevant to individual users'
preferences. This project explores how data visualization can enhance personalized content
recommendations.

Objective
The objective of this project is to develop a framework for personalized content
recommendation using data visualization techniques. This framework will leverage user data
to identify patterns and trends, enabling the creation of personalized recommendations
presented through effective data visualizations.
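
As a rough illustration of what "personalized recommendation" could mean here, the sketch below suggests unread books from a user's highest-rated genre. It is only one simple heuristic, not the project's actual method. The function name recommend_books, the toy data, and the column names (UserID, BookID, Rating, Genre, taken from the Dataset Description) are assumptions made for this example.

```python
import pandas as pd


def recommend_books(ratings, user_id, top_n=3):
    """Suggest unread books from the user's highest-rated genre.

    Hypothetical helper: assumes columns UserID, BookID, Rating and Genre.
    """
    user_rows = ratings[ratings["UserID"] == user_id]
    if user_rows.empty:
        return []  # Unknown user: nothing to personalize on

    # Genre that this user rates highest on average
    favourite_genre = user_rows.groupby("Genre")["Rating"].mean().idxmax()

    # Books in that genre which the user has not rated yet
    seen = set(user_rows["BookID"])
    candidates = ratings[
        (ratings["Genre"] == favourite_genre) & (~ratings["BookID"].isin(seen))
    ]

    # Rank the candidates by their average rating across all users
    ranked = candidates.groupby("BookID")["Rating"].mean().sort_values(ascending=False)
    return ranked.head(top_n).index.tolist()


# Toy data for illustration only; not taken from the project dataset
toy_ratings = pd.DataFrame({
    "UserID": [1, 1, 2, 2, 3, 3],
    "BookID": [10, 20, 10, 30, 30, 40],
    "Rating": [5, 2, 4, 5, 4, 3],
    "Genre": ["Fantasy", "Romance", "Fantasy", "Fantasy", "Fantasy", "Fantasy"],
})
print(recommend_books(toy_ratings, user_id=1))
```

The visualizations in the rest of this document serve to check whether a heuristic like this is reasonable, for example by showing how ratings are distributed within each genre.
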
Dataset Description
The project will use a sample CSV dataset containing book rating data. This data might
include columns for:

● UserID: Unique identifier for each user
● BookID: Unique identifier for each book
● Rating: User's rating for a specific book (e.g., 1-5 stars)
● Genre: Genre of the book (optional)

Source: https://www.kaggle.com/datasets/zilmabezerra/book-recommendation-datasets.csv
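
Before plotting, it may help to confirm that a loaded file actually has the columns listed above. The sketch below is a minimal check, assuming the column names UserID, BookID, Rating and Genre and a 1-5 rating scale (the scale is given only as an example above). The helper name load_ratings and the inline sample CSV are hypothetical.

```python
import io

import pandas as pd

# Columns treated as mandatory in this sketch; Genre is optional
REQUIRED_COLUMNS = ["UserID", "BookID", "Rating"]


def load_ratings(source):
    """Load rating data from a path or file-like object and validate it."""
    data = pd.read_csv(source)

    missing = [col for col in REQUIRED_COLUMNS if col not in data.columns]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")

    # Fill in a placeholder when the optional Genre column is absent
    if "Genre" not in data.columns:
        data["Genre"] = "Unknown"

    # Drop rows without a rating and keep only ratings on the assumed 1-5 scale
    data = data.dropna(subset=["Rating"])
    data = data[data["Rating"].between(1, 5)]
    return data.reset_index(drop=True)


# Small in-memory CSV for illustration only
sample_csv = io.StringIO(
    "UserID,BookID,Rating\n"
    "1,10,5\n"
    "1,20,\n"
    "2,10,7\n"
    "2,30,3\n"
)
ratings = load_ratings(sample_csv)
print(ratings)
```

In this sample, the row with a missing rating and the row with an out-of-range rating are dropped, so two rows remain.
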

Data Visualization Techniques


The project will employ various data visualization techniques to uncover user preferences
and content characteristics:

Univariate Visualization

Histogram :

Explore the distribution of user interactions (e.g., number of views per item)

Program:
import pandas as pd
import matplotlib.pyplot as plt

def plot_rating_distribution(data_file):
    """This function reads book rating data from a CSV file and creates a
    histogram to visualize the distribution of ratings.

    Args:
        data_file (str): Path to the CSV file containing book rating data.
    """
    # Load data
    data = pd.read_csv(data_file)

    # Create histogram for book ratings
    plt.hist(data["Rating"])
    plt.xlabel("Book Rating")
    plt.ylabel("Number of Ratings")  # Each bar counts ratings, not distinct users
    plt.title("Distribution of Book Ratings")
    plt.show()

Output:
Bar Graph:
Compare user interactions across different content categories

Program:
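import pandas as pd
import matplotlib.pyplot as plt

def plot_average_rating_per_user(data_file):
    """Create a bar graph of the average rating given by each user.

    Args:
        data_file (str): Path to the CSV file containing book rating data.
    """
    # Load data
    data = pd.read_csv(data_file)

    # Create bar graph for average rating per user
    # (column name "UserID" follows the Dataset Description)
    avg_ratings = data.groupby("UserID")["Rating"].mean().reset_index()

    # Convert IDs to strings so each user gets one evenly spaced bar
    plt.bar(avg_ratings["UserID"].astype(str), avg_ratings["Rating"])
    plt.xlabel("User ID")
    plt.ylabel("Average Book Rating")
    plt.title("Average Rating by User")
    plt.xticks(rotation=45)  # Rotate x-axis labels for readability
    plt.show()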
Output:

Bivariate Visualization

Scatter Plot:

Investigate relationships between user features and interaction types (e.g., age vs. number
of likes)

Program:

import pandas as pd
import matplotlib.pyplot as plt

def plot_rating_vs_author(data_file):
    """Create a scatter plot of book ratings against authors.

    Args:
        data_file (str): Path to the CSV file containing book rating data.
    """
    # Load data
    data = pd.read_csv(data_file)

    # Sample code assuming an 'Author' column exists
    plt.scatter(data["Rating"], data["Author"])
    plt.xlabel("Book Rating")
    plt.ylabel("Author")
    plt.title("Relationship Between Rating and Author")
    plt.show()

Output:

Box Plot:

Compare interaction distributions across different user demographics (e.g., views by ratings)
Program:

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

def plot_rating_by_genre(data_file):
    """Create a box plot of book ratings for each genre.

    Args:
        data_file (str): Path to the CSV file containing book rating data.
    """
    # Load data
    data = pd.read_csv(data_file)

    # Create boxplot (requires the optional 'Genre' column)
    sns.boxplot(x="Genre", y="Rating", showmeans=True, data=data)
    plt.xlabel("Book Genre")
    plt.ylabel("Book Rating")
    plt.title("Book Rating Distribution by Genre")
    plt.xticks(rotation=45)  # Rotate x-axis labels for readability
    plt.show()

Output:

Multivariate Visualization

Pair Plot:

Analyze relationships between multiple user features and interactions.

Program:
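import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

def create_pairplot(data_file):
    """Create a pair plot of the columns in the rating data.

    Args:
        data_file (str): Path to the CSV file containing book rating data.
    """
    # Load data
    data = pd.read_csv(data_file)

    # Create pair plot (assuming all columns are numerical)
    sns.pairplot(data)
    plt.show()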

Output:

Heatmap:

Visualize correlations between content features and user interactions.


Program:
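import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

def create_heatmap(data_file):
    """Create a heatmap of correlations between the numeric columns.

    Args:
        data_file (str): Path to the CSV file containing book rating data.
    """
    # Load data
    data = pd.read_csv(data_file)

    # Calculate correlation matrix (text columns such as Genre are skipped)
    correlation = data.corr(numeric_only=True)

    # Create heatmap
    sns.heatmap(correlation)
    plt.title("Correlation Heatmap")
    plt.show()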

Output:
Interactive Visualization

The project will incorporate interactive elements to allow users to explore recommendations
dynamically:
Scatter Plots with Brushing:

Users can filter data points to focus on specific user segments or content categories.
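
The filtering step behind brushing can be sketched with pandas alone, independent of any plotting library. The example below is a minimal illustration, assuming Rating and Genre columns as in the Dataset Description. The function name brush and the toy data are hypothetical, and a real brush would take its selection from the chart rather than from function arguments.

```python
import pandas as pd


def brush(data, rating_range=None, genres=None):
    """Return the rows that fall inside the selected rating interval and genres.

    Passing None for a selection leaves that dimension unfiltered.
    """
    mask = pd.Series(True, index=data.index)
    if rating_range is not None:
        low, high = rating_range
        mask &= data["Rating"].between(low, high)  # Inclusive on both ends
    if genres is not None:
        mask &= data["Genre"].isin(genres)
    return data[mask]


# Toy data for illustration only
toy = pd.DataFrame({
    "UserID": [1, 2, 3, 4],
    "Rating": [5, 3, 4, 1],
    "Genre": ["Fantasy", "Romance", "Fantasy", "Mystery"],
})
selected = brush(toy, rating_range=(3, 5), genres=["Fantasy"])
print(selected)
```

Keeping the filter separate from the chart makes it easy to reuse the same selection for several linked views.
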

Program:
import pandas as pd
from plotly.graph_objs import Figure, Scatter

# Load data and prepare (replace with your data loading and cleaning)
data = pd.read_csv("book_ratings.csv")
ratings = data["Rating"]
genres = data["Genre"]  # Assuming a Genre column exists

# Create scatter plot with basic structure
plot = Scatter(
    x=ratings,
    y=genres,
    mode="markers",
    marker=dict(size=10, color="blue", opacity=0.7),
)

# Define layout options; dragmode="select" makes box selection
# (brushing) the default drag tool
layout = dict(
    title="Book Ratings by Genre",
    xaxis=dict(title="Rating"),
    yaxis=dict(title="Genre"),
    dragmode="select",
)

# Combine plot and layout
fig = Figure(data=[plot], layout=layout)

# Display the interactive plot (replace with deployment on a web server)
# fig.write_html("interactive_scatter.html")
Output:
Interactive Dashboards:
Users can interact with dashboards to customize recommendations based on their
preferences.

Program:

# Import libraries (Dash 2.x style; replace with specific choices)
from dash import Dash, Input, Output, dcc, html
import pandas as pd
import plotly.express as px  # For visualizations

# Load and preprocess data (replace with your data loading)
data = pd.read_csv("book_ratings.csv")
# ... (preprocessing steps)

# Initialize Dash app
app = Dash(__name__)

# "All" is added as an explicit option so the default value is valid
genre_options = ["All"] + sorted(data["Genre"].dropna().unique())

# Define layout with UI components and placeholders for visualizations
app.layout = html.Div([
    html.H1("Book Recommendation Dashboard"),
    dcc.Dropdown(
        id="genre-filter",
        options=[{"label": genre, "value": genre} for genre in genre_options],
        value="All",  # Default value
    ),
    dcc.RangeSlider(
        id="rating-range",
        min=data["Rating"].min(),
        max=data["Rating"].max(),
        value=[data["Rating"].min(), data["Rating"].max()],  # Default range
    ),
    html.Div(id="visualization-container"),  # Placeholder for visualizations
])

# Define callback functions to update visualizations based on user interaction
@app.callback(
    Output("visualization-container", "children"),
    [Input("genre-filter", "value"), Input("rating-range", "value")],
)
def update_visualization(genre, rating_range):
    filtered_data = data
    if genre != "All":  # "All" keeps every genre
        filtered_data = filtered_data[filtered_data["Genre"] == genre]  # Filter by genre
    filtered_data = filtered_data[
        (filtered_data["Rating"] >= rating_range[0])
        & (filtered_data["Rating"] <= rating_range[1])
    ]  # Filter by rating range

    # Create visualizations here (replace with specific chart types and libraries);
    # plotting BookID against Rating is an assumption for this sketch
    scatter_plot = px.scatter(filtered_data, x="BookID", y="Rating")
    return dcc.Graph(figure=scatter_plot)

if __name__ == "__main__":
    app.run(debug=True)

Output:
Assumed Scenario
Imagine a music streaming service that utilizes this framework. By analyzing user listening
habits (interaction data), the system can recommend personalized playlists. Data
visualization techniques can help identify trends like:

● Genres preferred by different age groups (univariate visualization)
● Correlation between listening time and mood (bivariate visualization)
● How user demographics influence playlist preferences (multivariate visualization)
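
To make the first trend concrete, the sketch below counts listens per age group and genre and picks the most-played genre for each group. The listening log, its column names (AgeGroup, Genre) and its values are invented for illustration and do not come from any real service.

```python
import pandas as pd

# Toy listening log for illustration only; one row per listening event
listens = pd.DataFrame({
    "AgeGroup": ["18-24", "18-24", "18-24", "25-34", "25-34", "25-34", "35-44", "35-44"],
    "Genre": ["Pop", "Pop", "Rock", "Rock", "Rock", "Jazz", "Jazz", "Jazz"],
})

# Count listens for every (age group, genre) combination
genre_by_age = pd.crosstab(listens["AgeGroup"], listens["Genre"])

# Most-played genre in each age group
top_genre = genre_by_age.idxmax(axis=1)
print(genre_by_age)
print(top_genre)
```

A table like genre_by_age could be passed to a heatmap or a grouped bar chart to show the same trend visually.
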

Conclusion
By leveraging data visualization, this project aims to create a personalized content
recommendation system that is not only effective but also user-friendly and engaging.
Through interactive visualizations, users can gain insights into their preferences and
discover new content they might enjoy.
