Dr. M.K. Jayanthi Kannan et. al., International Journal of Advance Research, Ideas and Innovations in Technology (ISSN: 2454-132X)

ISSN: 2454-132X
Impact Factor: 6.078
(Volume 11, Issue 2 - V11I2-1171)
Available online at: https://www.ijariit.com
AI-Driven Vibebox: Adaptive Music Streaming Personalized Based on Emotion
Dr. M.K. Jayanthi Kannan, [email protected]
Anirudh Kanwar, [email protected]
Harsh Chaturvedi, [email protected]
Aditya R Patil, [email protected]
Abhimaan Yadav, [email protected]
All authors: VIT Bhopal University, Bhopal-Indore Highway, Kothrikalan, Sehore, Madhya Pradesh - 466114

ABSTRACT
Music recommendation systems play a crucial role in addressing the challenges of information overload and personalization in the
digital music landscape. This paper presents the implementation and contribution of a novel music recommendation system that aims
to enhance the user experience and overcome the limitations of existing approaches. The paper provides an overview of the
architecture, methodology, and key findings of AI-Driven VibeBox: Adaptive Music Streaming Personalized Based on Emotion,
highlighting its contributions to music recommendation systems. The exponential growth of digital music platforms has led to an
overwhelming abundance of music content, making it increasingly difficult for users to discover and explore new music that aligns
with their preferences. Music recommendation systems have emerged as a vital tool to address this challenge, leveraging various
techniques to provide personalized suggestions and enhance the user experience. MoodSync Vibebox is a music recommendation
system that seeks to advance the state-of-the-art in this domain. This review paper aims to critically analyze the project's methodology,
findings, and contributions, while also situating it within the broader context of music recommendation system research.
Keywords: Artificial Intelligence, ML, Intelligent Automatic Playlist Generation, Text Analysis, Fine-tuned BERT model, Sentiments
and Emotions Recognition, Facial Emotion Analysis, CNN, Playlist Generation Model, React.js, Firebase, Web Application
Development.

INTRODUCTION
The increasing popularity of music streaming services has greatly affected how users listen to and interact with music. Many of these services have good recommendation systems, but the recommendations are mostly based on the user's prior usage and do not consider how the user feels at the moment. Music is personal, and it is very effective when a listener needs to vent or to lift their mood. A real-time understanding of the user's emotions therefore helps to create a more customized approach to delivering music. Given users' time constraints, conventional approaches to constructing playlists, however sophisticated, simply play back a fixed set of items and do not solve this problem. Adding emotion recognition to music systems, together with dynamic generation of a set of songs from the recognized emotion, makes it possible to better meet a user's music needs.

© 2025, IJARIIT - All rights reserved. Website: www.ijariit.com Talk to Counselor: 9056222273 Page: 61
This paper proposes a combined solution to this problem that uses recent artificial intelligence techniques for natural-language text and images to recognize emotion and to select appropriate songs based on the user's mood. The proposed system employs BERT for text sentiment analysis, CNNs for facial emotion recognition, and a hybrid recommendation system for song recommendation. These components are integrated in a web-based app built with React.js, Vite, and Tailwind CSS, with Firebase as the backend. The primary objective of this paper is to develop and evaluate an intelligent music recommendation system capable of the following:
Accurately Detecting User Moods: Using AI techniques such as sentiment analysis, facial recognition, and contextual inputs.
Creating Adaptive Playlists: Generating dynamic playlists that align with the user's current emotional state or activity.
Enhancing Music Discovery: Introducing users to diverse music that resonates with their vibe while expanding their listening preferences.
Improving User Engagement: Offering an emotionally responsive and personalized listening experience to foster long-term user satisfaction.

LITERATURE REVIEW AND DOMAIN ANALYSIS


Music recommendation systems have evolved over the years, with researchers and practitioners exploring a range of techniques to improve the accuracy and personalization of recommendations. Collaborative filtering approaches leverage user-item interaction data to identify similar users and make recommendations based on their preferences. Content-based filtering methods analyze the inherent characteristics of music, such as audio features and metadata, to suggest items with similar properties. Hybrid approaches combine these techniques to leverage the strengths of both and overcome their individual limitations. Despite the advancements in the field, music recommendation systems still face several challenges, including cold-start problems, data sparsity, and scalability issues. The need for innovative approaches that can address these challenges and enhance the user experience has motivated the development of MoodSync Vibebox.
Facial expressions indicate a person's current mental state. In interpersonal communication, feelings are most often expressed through nonverbal cues such as hand gestures, facial expressions, and tone of voice. Preema et al. stated that creating and managing a large playlist is time-consuming and difficult. In their work, the music player itself selects a song according to the current mood of the user. The application scans and classifies audio files according to their audio features to produce mood-based playlists, and it uses the Viola-Jones algorithm for face detection and facial expression extraction. However, the prevailing algorithms are slow, are less accurate, and increase the overall cost of the system by requiring additional hardware (e.g., EEG systems and sensors). The paper presents an algorithm that automatically generates an audio playlist based on a person's facial expressions, saving the time and labor of doing so manually. The algorithm aims to reduce the overall computational time and cost of the designed system while improving its accuracy. The system's facial expression recognition module is validated against both user-dependent and user-independent datasets.

Figure 1: A Flow Chart of the Project AI-Driven VibeBox: Adaptive Music Streaming
FUNCTIONAL MODULES IMPLEMENTATION OF MOODSYNC VIBEBOX: AI-CURATED PLAYLIST
The proposed system provides interaction between the user and the music player. The system first captures the user's face with the camera. Captured images are fed into a Convolutional Neural Network, which predicts the emotion. The emotion derived from the captured image is then used to retrieve a playlist of songs. The main aim of the proposed system is to provide a music playlist automatically in response to the user's mood, which can be happy, sad, neutral, or surprised, and to improve that mood. If the subject shows a negative emotion, the system presents a selected playlist containing the most suitable kinds of music for lifting the person's mood. Music recommendation based on facial emotion recognition contains four modules:
Real-Time Capture: The system captures the face of the user correctly.
Face Recognition: The module takes the user's face as input. The convolutional neural network is trained to evaluate the features of the user image.
Emotion Detection: Features extracted from the user image are used to detect the emotion and, depending on the user's emotion, the system generates captions.
Music Recommendation: The recommendation module suggests songs to the user by mapping their emotions to the mood type of the song.
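
The mapping step of the Music Recommendation module can be illustrated with a minimal Python sketch. The mood labels, the emotion-to-mood table, and the small catalogue below are hypothetical placeholders, not taken from the paper; they only show how a detected emotion might be mapped to a song mood type, with negative emotions routed to mood-lifting music.

```python
import random

# Hypothetical mapping from a detected emotion to the song mood type to play.
# Negative emotions are routed to mood-lifting categories (an assumption that
# follows the stated aim of improving the listener's mood).
EMOTION_TO_MOOD = {
    "happy": "upbeat",
    "sad": "uplifting",
    "angry": "calm",
    "surprised": "energetic",
    "neutral": "chill",
}

# Hypothetical mood-tagged catalogue of (title, mood type) pairs.
CATALOGUE = [
    ("Track A", "upbeat"),
    ("Track B", "upbeat"),
    ("Track C", "uplifting"),
    ("Track D", "uplifting"),
    ("Track E", "calm"),
    ("Track F", "energetic"),
    ("Track G", "chill"),
]


def recommend(emotion, catalogue=CATALOGUE, k=2, seed=None):
    """Return up to k song titles whose mood tag matches the detected emotion."""
    # Unknown emotions fall back to the neutral mood type.
    mood = EMOTION_TO_MOOD.get(emotion, EMOTION_TO_MOOD["neutral"])
    matches = [title for title, tag in catalogue if tag == mood]
    # Shuffle so that repeated requests do not always return the same songs.
    random.Random(seed).shuffle(matches)
    return matches[:k]
```

In a full system the catalogue would come from a music service or a mood-tagged library rather than a hard-coded list.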

Figures 2 and 3: AI-Driven VibeBox: Adaptive Music Streaming Flow Chart and Components Diagram

MoodSync Vibebox is designed with a robust architecture that integrates multiple components to provide a comprehensive music recommendation solution. The system leverages a diverse set of data sources, including user listening history, audio features, and contextual information, to build a personalized user profile and generate relevant recommendations. We built the Convolutional Neural Network model using the FER2013 (FER-2013) dataset from Kaggle, which is split into a training set and a testing set. The training set consists of 24176 images and the testing set contains 6043 images. The dataset contains 48x48 pixel grayscale images of faces, covering both posed and unposed headshots. Each image is labeled as one of five emotions: happy, sad, angry, surprised, and neutral.
The project's methodology involves a multi-pronged approach, as outlined below:
Data Collection: The system collects user listening data from various music streaming platforms, as well as audio features and metadata from music databases. This data is used to create user profiles and build the recommendation engine.
Algorithm Development: MoodSync Vibebox employs a hybrid recommendation algorithm that combines collaborative filtering and content-based filtering techniques. The algorithm incorporates novel approaches to address the challenges of cold-start and data sparsity, aiming to provide accurate and personalized recommendations.
User Testing: The project's user testing phase involves a comprehensive evaluation of the system's performance, including metrics such as precision, recall, and user satisfaction. The feedback gathered from user studies is used to refine the recommendation algorithm and improve the overall user experience.
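
The paper does not specify how the hybrid algorithm blends its two signals, so the following Python sketch is only one plausible scheme under stated assumptions: both scores lie in [0, 1], and the weight given to collaborative filtering grows with the number of interactions a user has logged, so that a new user (the cold-start case) is ranked by content alone. The function names and the parameters `ramp` and `max_cf_weight` are hypothetical.

```python
def hybrid_score(cf_score, cb_score, n_interactions, ramp=20, max_cf_weight=0.7):
    """Blend a collaborative-filtering score with a content-based score.

    Both scores are assumed to lie in [0, 1]. The collaborative weight grows
    linearly with the user's interaction count and is capped at max_cf_weight,
    so a brand-new user (cold start) is ranked by content alone.
    """
    cf_weight = max_cf_weight * min(n_interactions / ramp, 1.0)
    return cf_weight * cf_score + (1.0 - cf_weight) * cb_score


def rank_songs(candidates, n_interactions):
    """Sort (song, cf_score, cb_score) tuples by descending hybrid score."""
    return sorted(
        candidates,
        key=lambda c: hybrid_score(c[1], c[2], n_interactions),
        reverse=True,
    )
```

With this weighting, the same candidate list can be ordered differently for a new user and an established one, which is the behaviour a cold-start fallback is meant to provide.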

DEVELOPING MOODSYNC VIBEBOX REQUIREMENT ARTIFACTS


The success of any AI-driven system depends on well-defined requirements and artifacts that guide its development. MoodSync VibeBox requires a combination of hardware, software, and AI models to deliver real-time, emotion-based music recommendations. This section outlines the functional and non-functional requirements, system specifications, and development tools necessary for implementing MoodSync VibeBox. It also discusses essential artifacts such as system architecture diagrams, data flow models, and UI prototypes, which help visualize and structure the system's design. Establishing these requirements helps ensure a seamless, efficient, and scalable AI-driven music recommendation experience.
Hardware and Software Requirements: To develop and deploy MoodSync VibeBox, a combination of hardware and software components is essential. These support real-time emotion recognition, AI-based music recommendations, and user interaction.

Hardware Requirements:
Processor: Intel Core i5/i7 (or AMD equivalent) with AI acceleration support.
RAM: Minimum 8GB (16GB recommended for smooth AI model execution).
Storage: SSD with at least 256GB for storing datasets and model weights.
Webcam/Sensors: High-resolution webcam or infrared camera for facial emotion recognition.
Microphone: High-quality audio input device for voice-based emotion detection.
Software Requirements:
Operating System: Windows 10/11, macOS, or Linux (Ubuntu preferred for AI development).
Programming Language and Frameworks: Python (primary language for AI development and backend processing); TensorFlow/PyTorch (machine learning frameworks for emotion detection models); OpenCV (facial recognition and emotion analysis); NLTK/TextBlob (sentiment analysis in voice/text-based inputs).
Databases & Cloud Storage: MongoDB/MySQL (storing user preferences and emotion data); Firebase/AWS S3 (cloud storage for large datasets and AI models).
User Interface & Web Technologies: React.js/Flutter (building an intuitive, user-friendly interface); Flask/Django (backend framework to handle API requests and AI processing).
APIs & Libraries: Spotify API, Gemini cloud vision API, Microsoft Azure Emotion API, Twilio API.
Testing Tools: Postman, Jest for unit and integration testing.

Specific Project Requirements:
Data Requirements: Oscar AI requires structured and unstructured data for accurate speech processing and text enhancement.
User input data: Real-time speech data and typed text.
Custom vocabulary: Industry-specific terms added by users.
Transcription history: Stored securely for user reference.
AI learning dataset: Past corrections used for continuous improvement.
Functional Requirements: Real-time speech-to-text conversion with high accuracy. Emotion detection and AI-driven rewording for improved readability. Support for multiple accents and dialects. Cloud-based and offline processing for flexibility. Customizable settings, including user-defined vocabulary and theme selection.

PERFORMANCE AND SECURITY REQUIREMENT AI-DRIVEN VIBEBOX: ADAPTIVE MUSIC STREAMING
Performance: Low-latency transcription processing, scalability to handle increased user demand, and efficient resource utilization to balance accuracy and speed.
Security: OAuth 2.0 authentication for user login. SSL/TLS encryption for secure data transmission. Access control to restrict unauthorized modifications.
Look and Feel Requirement: Minimalist UI: Ensuring ease of navigation for all users. Dark and light themes: Customizable for user preference. Real-time song lyrics: Displayed instantly for a better UI. Adaptive layout: Optimized for various screen sizes and orientations.
This section has described the essential hardware, software, APIs, libraries, and testing tools required for the development of MoodSync VibeBox. The system integrates AI-driven emotion recognition with real-time music recommendations, requiring powerful processors, biometric sensors, and cloud-based databases. Key APIs such as Spotify, Google Cloud Vision, and IBM Watson Tone Analyzer enable seamless data processing. AI libraries like TensorFlow, OpenCV, and NLTK support emotion detection, while Flask and React.js handle backend and UI development. Rigorous testing with Postman, Selenium, and PyTest ensures system accuracy and performance. These components collectively establish a robust, scalable, and intelligent music recommendation system.
5.1 The Innovative Design Methodology And Its Novelty
The design methodology of MoodSync VibeBox focuses on developing a real-time AI-driven music recommendation system that adapts dynamically to a user's emotional state. Unlike traditional methods that rely solely on listening history, this approach integrates facial expression analysis, voice sentiment recognition, and biometric feedback to enhance personalization. The primary goal of this methodology is to create a highly adaptive and personalized music recommendation system using advanced AI techniques. The MoodSync VibeBox methodology consists of the following: capture of facial expressions, voice tone, and biometric signals using OpenCV, NLTK, and TensorFlow; a scalable architecture ensuring low-latency processing; strong security measures to protect user data; and a user-friendly interface for smooth interaction across devices.

5.2 Functional Modules Design and Analysis


MoodSync VibeBox consists of several interdependent functional modules, each responsible for specific tasks:
Emotion Detection Module: Captures user emotions through facial expressions, voice tone, and biometric signals. Uses OpenCV for facial analysis and NLTK/TextBlob for voice-based sentiment recognition.
User Customization Module: Allows users to set preferences, add custom vocabulary, and adjust correction intensity.
Backend Processing Module: Manages requests, stores transcription history, and handles authentication.
UI/UX Module: Ensures a clean and interactive interface with real-time feedback.
Each module is designed for high performance, ensuring smooth processing and minimal latency.
Software Architecture Design: The system follows a client-server architecture, ensuring efficient interaction between the frontend and backend. Key components include:
Frontend (Flutter): Handles user input, displays transcriptions, and allows user interaction.
Backend (Node.js, Express.js): Processes requests, connects to AI models, and ensures secure data handling.
Database (MongoDB, Firebase): Stores user preferences, custom vocabulary, and past transcriptions.
Emotion Detection Module: Uses OpenCV, TensorFlow, and NLTK to analyze facial expressions, voice tone, and biometric data.
This modular structure ensures scalability, flexibility, and maintainability.

5.3 Subsystem Services AI-Driven VibeBox: Adaptive Music Streaming


Oscar AI incorporates several key subsystems to enhance functionality:
Music Recommendation Subsystem: Fetches songs based on mood using Spotify API / Apple Music API.
AI Refinement Subsystem: Improves text clarity and correctness.
Authentication Subsystem: Ensures secure user logins using OAuth 2.0.
Data Storage Subsystem: Manages user history and preferences securely.
Each subsystem interacts seamlessly, ensuring efficient performance and user satisfaction.
User Interface Design: The UI is designed with a focus on simplicity, clarity, and ease of use, incorporating:
Minimalist design: Prioritizing key functions like transcription, correction, and customization.
Real-time feedback: Errors and AI suggestions appear instantly.
Adaptive layouts: Responsive design ensures smooth use across devices.
Dark and light themes: Customization for user preference.
Performance and Security Measures:
Performance Enhancements: Optimized speech recognition for fast transcription.

Efficient API integration to reduce processing delays. Load testing to ensure scalability during peak usage.
Security Implementations: OAuth 2.0 authentication for secure login. SSL/TLS encryption to protect data transmission. Role-based access control to prevent unauthorized modifications.
This section has covered the essential requirements, architectural design, and subsystem services of MoodSync VibeBox. The system relies on a layered software architecture, integrating emotion detection, data processing, AI-driven music recommendations, user feedback mechanisms, and a responsive UI. Key subsystems include facial and voice emotion analysis, AI-based mood classification, real-time playlist adaptation, and backend data management. By utilizing advanced APIs, machine learning models, and scalable cloud-based services, MoodSync VibeBox ensures a personalized, adaptive, and real-time music experience, setting it apart from conventional recommendation systems.

TECHNOLOGICAL CONFIGURATION IMPLEMENTATION AND ANALYSIS

This section covers the technical implementation of MoodSync VibeBox, detailing its system architecture, emotion recognition via multi-modal AI, contextual intelligence integration, and dynamic vibe state modeling. It covers playlist generation logic, user feedback loops, scalability strategies, and data privacy measures, highlighting the platform's adaptive and secure design.
Technical Coding and Code Solutions: Voultbox AI's implementation is built using a Flutter frontend with a Node.js and Express.js backend, integrated with Gemini AI for advanced text refinement. The system uses: Facial Emotion Recognition (CNN-based); Text Sentiment Analysis (NLP-based); Voice Emotion Classification (Spectrogram + RNN); Contextual State Detection; Vibe State Mapping; and Playlist Generation via Spotify API.
Key Code Optimizations: Asynchronous processing for seamless real-time transcription. Efficient API calls to minimize response time. Error handling mechanisms to ensure robust performance.
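
The three optimizations listed above (asynchronous processing, efficient API calls, and error handling) can be sketched with Python's standard asyncio module. The paper's own backend is described as Node.js, so this is an illustration of the pattern rather than the authors' code; `call_with_retry`, `analyse_inputs`, and the callables passed to them are hypothetical names.

```python
import asyncio


async def call_with_retry(make_call, retries=3, timeout=2.0, backoff=0.1):
    """Await make_call() with a per-attempt timeout and exponential backoff."""
    last_error = None
    for attempt in range(retries):
        try:
            # Bound each attempt so a slow service cannot stall the pipeline.
            return await asyncio.wait_for(make_call(), timeout=timeout)
        except (asyncio.TimeoutError, ConnectionError) as err:
            last_error = err
            # Wait a little longer after every failed attempt.
            await asyncio.sleep(backoff * (2 ** attempt))
    raise RuntimeError(f"call failed after {retries} attempts") from last_error


async def analyse_inputs(face_call, text_call):
    """Run the (hypothetical) face and text analysers concurrently."""
    face, text = await asyncio.gather(
        call_with_retry(face_call),
        call_with_retry(text_call),
    )
    return {"face": face, "text": text}
```

Running the two analysers through `asyncio.gather` means the total wait is governed by the slower call rather than the sum of both, and the retry wrapper keeps a transient failure in one service from aborting the whole request.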

Working Layout of Forms: To ensure the effective technical implementation and validation of the MoodSync VibeBox system, a structured set of working forms has been developed. These forms support the testing, verification, and refinement of system components including emotion recognition, context detection, vibe mapping, playlist generation, and user feedback integration.
Emotion Recognition Test Log: Records test instances for the facial, textual, and vocal emotion recognition modules. It tracks model performance, predicted outcomes, and confidence levels.
Contextual State Capture Record: Captures real-time contextual data including location, weather, and time. This is used to verify the accuracy of contextual state modeling.
Vibe State Mapping Verification: Used to evaluate the mapping accuracy between emotional and contextual inputs and the system-defined "vibe states."
Playlist Generation Evaluation: Monitors playlist effectiveness based on user interaction metrics such as engagement duration, liked songs, and skip rates.
The user enters text or grants webcam access, and the text and facial data are processed through the mood recognition models.
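
As an illustration of how an Emotion Recognition Test Log might be kept in code, the sketch below stores one record per test instance and summarises accuracy and mean confidence per modality. The record fields and the `summarise` helper are assumptions made for this example; the paper does not give the form's exact layout.

```python
from dataclasses import dataclass


@dataclass
class EmotionTestRecord:
    """One row of a (hypothetical) emotion recognition test log."""
    modality: str      # "face", "text" or "voice"
    expected: str      # ground-truth emotion label
    predicted: str     # label produced by the model
    confidence: float  # model confidence for the predicted label


def summarise(records):
    """Return accuracy and mean confidence for each modality in the log."""
    totals = {}
    for record in records:
        entry = totals.setdefault(
            record.modality, {"n": 0, "correct": 0, "confidence": 0.0}
        )
        entry["n"] += 1
        entry["correct"] += int(record.expected == record.predicted)
        entry["confidence"] += record.confidence
    return {
        modality: {
            "accuracy": entry["correct"] / entry["n"],
            "mean_confidence": entry["confidence"] / entry["n"],
        }
        for modality, entry in totals.items()
    }
```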

Figure 4: Workflow Diagram of the Project AI-Driven VibeBox: Adaptive Music Streaming.

The detected mood triggers the playlist generation model. A timer was implemented to track recording duration.

The recommended playlist is displayed in real time on the user interface:


Figure 5: AI-Driven VibeBox: Adaptive Music Streaming Visual Functioning of the project

Figure 6: Recommended UI AI-Driven VibeBox: Adaptive Music Streaming

Figure: Prototype, Algorithm, Program Logic Implementing MoodSync VibeBox, AI-Curated Playlist

Figure: The Implementation of VibeBox, Automatic Playlist Based on User Mood in Real Time

Figure: The Recommended Working Prototype VibeBox


The application's core structure was configured using a StatelessWidget for initialization. A transcription results screen was implemented to view, edit, and share transcribed text. Keyboard Code: LinearLayout (Layout) acts as the root layout with vertical orientation and padding. RelativeLayout (relativeLayout) is used to position the micIcon and textInputLayout properly.

WORKING PROTOTYPE MOODSYNC VIBEBOX: AI-CURATED PLAYLIST
The playlist creation logic of Voultbox (automatic playlist curation using AI) was tested and validated. The test and validation phase is crucial to ensure the functional accuracy, performance efficiency, and reliability of the MoodSync VibeBox system. The modules of the system (emotion recognition, contextual intelligence, vibe mapping, and playlist generation) were tested under different real-world scenarios to validate their accuracy, responsiveness, and adaptability.
Testing Methods:
Emotion Recognition System. Objective: To verify the model's ability to detect human emotions accurately from multimodal inputs (facial expressions, voice, and text). Approach: Tested using a dataset of labeled facial images, voice recordings, and sentiment-tagged text inputs. Results: The CNN model for facial emotion achieved ~90% accuracy. The NLP sentiment classifier using a transformer pipeline yielded a 92% F1-score on validation data. The RNN-based voice model showed improved classification after augmentation with diverse vocal tones.
Contextual State Detection. Objective: Ensure accurate extraction of contextual metadata (time, weather, location). Approach: Tested with real-time API data and manual cross-validation. Results: 100% accuracy in time and location capture; weather API data was found to be 98% reliable during testing intervals.
Playlist Generation and Recommendation Engine. Objective: Ensure playlists match intended vibes and maximize user satisfaction. Approach: Generated playlists were evaluated through user testing and behavioral metrics (skips, likes, engagement time). Results: Average playlist satisfaction score: 4.3/5. Skip rate: <15%. Users responded positively to emotional-vibe matching.
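
For reference, the kinds of metrics reported above (F1-score, skip rate, and average satisfaction score) can be computed as in the following Python sketch. The function names and the event labels ("play", "skip", "like") are assumptions made for illustration, and the sketch does not reproduce the paper's data or results.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """F1 is the harmonic mean of precision and recall."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def skip_rate(events):
    """Fraction of playback events in which the listener skipped the song."""
    if not events:
        return 0.0
    return sum(1 for event in events if event == "skip") / len(events)


def mean_rating(ratings):
    """Average satisfaction score, for example on a 1 to 5 scale."""
    return sum(ratings) / len(ratings) if ratings else 0.0
```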

Figure 7: Functionality Criteria of the AI-Driven VibeBox: Adaptive Music Streaming.

7.1 The Performance Analysis AI-Driven VibeBox: Adaptive Music Streaming

Voultbox AI's performance was analyzed based on the following criteria:
Speed: Average response time for speech-to-text processing and AI enhancements.
Accuracy: Percentage of correct transcriptions and grammar corrections.
Scalability: Ability to handle multiple users simultaneously without delays.
Security: Effectiveness of encryption and authentication mechanisms.
Improvements were made to optimize AI response time and database queries, ensuring smooth, real-time processing for users.
This section has presented the technical foundation and implementation strategies behind the MoodSync VibeBox system. It details the development and integration of core modules such as facial, voice, and text-based emotion recognition; contextual state detection through real-time environmental data; and the mapping of these insights into personalized "vibe states." The curated playlists are generated using a recommendation engine powered by AI models and enhanced by user feedback via a reinforcement loop. The section also includes a working layout of forms used for data tracking and testing, a comprehensive validation matrix to evaluate system performance, and flowcharts illustrating the data pipeline. Overall, the technical framework demonstrates high modular accuracy, smooth integration, and adaptability based on user behavior.

7.2 Working Prototype Outcome and Usability Testing AI-driven Vibebox: Adaptive Music Streaming

Key Implementation and System Overview: To ensure its usability, the MoodSync VibeBox system was implemented through the integration of multi-modal artificial intelligence and real-time contextual awareness. The key technical implementations are summarized below.
Emotion Recognition Engine: Leverages convolutional neural networks (CNN) for facial expression analysis, recurrent neural networks (RNN) for voice emotion recognition, and transformer-based NLP models for text sentiment analysis. These modules provided high accuracy and consistency across varied input modalities.
Contextual State Analysis: Real-time integration of external data sources (weather, location, time) enabled the system to enrich emotion recognition with situational awareness, leading to more relevant and personalized recommendations.

Vibe State Mapping Logic: A rule-based ensemble method was used to combine emotional and contextual data to determine the user's current "vibe" or mood state. This was central to curating playlists that resonated with the user's real-time emotional state.
AI-Powered Playlist Generation: The recommendation engine was built using mood-tagged song datasets, optimized by user feedback and a reinforcement learning loop to refine future playlists.
User Feedback Loop: Interaction metrics (e.g., skips, likes, session duration) were tracked and used to adaptively improve personalization over time.
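
The paper does not list the rules used by the ensemble, so the following Python sketch shows only one way a rule-based vibe mapping could work, under stated assumptions: each modality reports an emotion with a confidence, the emotions are fused by a confidence-weighted vote, and a few hand-written rules combine the fused emotion with time and weather. The vibe labels, rules, and function names are hypothetical.

```python
from collections import defaultdict


def fuse_emotions(predictions):
    """Pick the emotion with the highest summed confidence across modalities.

    predictions is a list of (emotion, confidence) pairs, for example one
    pair each from the face, voice and text models.
    """
    votes = defaultdict(float)
    for emotion, confidence in predictions:
        votes[emotion] += confidence
    return max(votes, key=votes.get)


def vibe_state(predictions, hour, weather):
    """Map the fused emotion plus context to a (hypothetical) vibe label."""
    emotion = fuse_emotions(predictions)
    night = hour >= 22 or hour < 6
    # Rules are checked in order; the first match decides the vibe.
    if emotion == "sad" and weather == "rain":
        return "cozy-comfort"
    if emotion == "sad":
        return "gentle-uplift"
    if emotion == "angry":
        return "calm-down"
    if emotion == "happy" and not night:
        return "energetic"
    if night:
        return "wind-down"
    return "focus"
```

Keeping the rules in a single ordered function makes the mapping easy to audit against a verification form, at the cost of having to edit code whenever a rule changes.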

System Overview: The MoodSync VibeBox system architecture comprises several modular components working cohesively.
Input Acquisition Layer: Captures facial expressions, voice samples, and text input from the user interface.
Emotion Processing Layer: Applies machine learning models to derive emotional states from multimodal inputs.
Contextual Intelligence Layer: Retrieves real-time environmental data to construct contextual understanding.
Vibe Fusion Layer: Merges emotional and contextual data to determine the dominant vibe.
Recommendation Engine: Generates a playlist tailored to the detected vibe using Spotify API and internal mood-tagged libraries.
Feedback and Adaptation Layer: Learns from user behavior and iteratively enhances the model's predictive capability.
Significant Project Outcomes: The implementation of Oscar AI has resulted in multiple impactful outcomes, including:
Enhanced Transcription Accuracy: AI-powered grammar correction improves sentence structure and readability.
Time Efficiency: Real-time processing ensures faster note-taking in meetings, lectures, and professional settings.
User-Centric Adaptability: The system learns from user corrections, improving refinement over time.
Security and Reliability: Strong authentication and encryption techniques protect user data.
Scalability for Large-Scale Use: Backend optimizations ensure smooth performance under high user load.

FINDINGS AND RECOMMENDATIONS AI-DRIVEN VIBEBOX: ADAPTIVE MUSIC STREAMING


The development and implementation of the MoodSync VibeBox system demonstrate the feasibility and effectiveness of integrating emotional intelligence with AI-driven music recommendation. By accurately recognizing user emotions and contextual data, the system delivers personalized, real-time playlists that enhance user experience. The successful validation of each module confirms the system's technical reliability, while its modular design supports scalability and real-world adaptability. Overall, the project lays a strong foundation for future advancements in emotionally responsive media technologies.

This section summarizes the project's achievements, identifies current system limitations, and presents recommendations for future improvements. The system combines emotion recognition with contextual awareness to generate AI-curated playlists tailored to a user's vibe. Following a modular implementation and extensive validation, the platform demonstrates strong real-world applicability. However, like any emerging technology, it also faces constraints that offer avenues for enhancement and innovation.

Project Applicability in Real-world Scenarios: MoodSync VibeBox is designed for wide-scale applicability across various domains, such as:
Mental Health: Mood-based music for emotional support and stress relief.
Smart Homes: Adaptive ambiance through mood-aware music control.
Automotive: Enhances the driving experience with emotion-responsive playlists.
Fitness Apps: Boosts workouts or relaxation with vibe-aligned tracks.
Retail & Hospitality: Improves atmosphere with mood-driven background music.
Music Streaming: Offers smarter, personalized recommendations.


Figure 8 and 9. The Playlist Creation Logic AI-driven Vibebox: Adaptive Music Streaming based on Emotions

Figure 10. The current emotional Mood Sync playlist classification AI-driven Vibebox


Figure 11: AI-driven Vibebox: Adaptive Music Streaming AI-driven Vibebox

The evaluation of our project has yielded promising results, demonstrating its effectiveness in providing personalized music recommendations. The
system has achieved 60% precision and recall in studies, outperforming benchmark recommendation systems. The project's hybrid
recommendation approach has proven successful in addressing the cold-start and data sparsity challenges, enabling the system to provide
accurate suggestions even for users with limited listening history or for newly released music. The incorporation of contextual information, such
as mood and activity, has also contributed to the system's ability to deliver personalized and relevant recommendations. The findings of our
project align with the broader trends and advancements in the field of music recommendation systems.
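For readers who want to reproduce this kind of evaluation, the sketch below shows how precision and recall can be computed for a single recommendation list. The function name and the track identifiers are hypothetical, and the numbers are illustrative only; they are not the paper's experimental data.

```python
def precision_recall(recommended, relevant):
    """Return (precision, recall) for one user's recommendation list.

    precision = hits / number of recommended tracks
    recall    = hits / number of relevant tracks
    """
    recommended_set = set(recommended)
    relevant_set = set(relevant)
    hits = len(recommended_set & relevant_set)
    precision = hits / len(recommended_set) if recommended_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Illustrative data: 5 recommended tracks, 5 relevant tracks, 3 in common.
recommended = ["t1", "t2", "t3", "t4", "t5"]
relevant = ["t1", "t2", "t3", "t8", "t9"]
precision, recall = precision_recall(recommended, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With three of five recommendations relevant and three of five relevant tracks recovered, both metrics equal 0.60 in this toy case. In practice the two values are averaged over all test users and usually differ from each other.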

CONTRIBUTION AND FINDINGS OF AI-DRIVEN VIBEBOX: ADAPTIVE MUSIC STREAMING


While our project has shown promising results, there are several avenues for further research and improvement. Incorporating deep learning techniques, such as neural network-based recommendation models, may enhance the system's ability to capture complex patterns and relationships within the music data. Additionally, expanding the data sources to include multi-modal information, such as social media interactions and user contextual data, could lead to more personalized and engaging recommendations. Exploring the scalability and adaptability of the project's architecture will also be crucial, as music recommendation systems need to accommodate ever-growing music catalogs and evolving user preferences. Investigating ways to seamlessly integrate MoodSync VibeBox with various music platforms and services could further extend the system.

Although fully functional, the system has scope for improvement in the future. Various aspects of the application can be modified to produce better results and a smoother overall experience for the user. One of these is an alternative method that covers additional emotions currently excluded from our system, such as disgust and fear, so that music can also be played automatically for these emotions. Future work could also design a mechanism to support music therapy, helping music therapists treat patients suffering from mental stress, anxiety, acute depression, and trauma. The current system does not perform well in very poor lighting or with low camera resolution, which leaves room to add functionality that addresses these conditions in the future.

