An AI Powered Voice Assistant For Enhanced User Interaction (Voice-Bot)
ISSN No:-2456-2165
Abstract:- An AI voice assistant can make it easier for people who are less technically knowledgeable to use their computers and carry out everyday tasks. By offering voice-based interaction, it lowers the barrier for people who feel intimidated by, or unfamiliar with, conventional computer interfaces. The ability to launch programs using voice commands, such as "open Notepad" or "open File Explorer", makes the user experience more convenient and accessible, because particular applications can be started without manually browsing through menus or desktop shortcuts. In addition to standard computer functions such as restarting and shutting down, the assistant offers a number of more sophisticated functions. The book recommender proposes books based on the user's choices, while the movie recommender helps users discover new films based on their preferences.

Keywords:- Deep Face, Voice Bot, Speech-to-text, and Guidance

I. INTRODUCTION

The AI Voice Assistant is a desktop application. It is meant to let computing applications and the vast majority of everyday tasks be carried out effectively. Natural Language Processing (NLP) is the area of AI concerned with software that understands human language. Voice assistants can interact with users in natural language and make use of cloud computing. Built in Python from components such as speech recognition, text-to-speech, a speech engine, and Deep Face, the assistant is a helpful tool for completing tasks by voice. It includes a section with recommendations for films and books, and the final module is a more sophisticated one: an emotion-based music player that recognizes the user's facial expression and plays music in line with that emotion. A voice assistant lets you communicate with your computer by voice. It is software for Windows that includes an intelligent personal assistant, a human language interface, automation, and speech recognition.

The long history of voice assistants began when IBM introduced Shoebox at the Seattle World's Fair in 1962. The device could be used as a basic calculator and could recognize spoken digits. AI voice assistants are one application area in which Natural Language Processing (NLP) methods are used to comprehend and interpret human language. They use cloud computing to process and analyze voice data, enabling natural language interaction with users. In general, AI voice assistants combine NLP techniques with cloud computing and have developed into useful tools for carrying out actions via voice commands, making them effective and user-friendly in a variety of sectors.

II. LITERATURE SURVEY

According to [1], voice assistants are a crucial component of our technological ecosystem since they allow us to communicate with devices and carry out tasks orally. Nowadays, we prefer automated systems over routine manual processes because we want everything to be completed easily and faster than expected. Automated processes make us more efficient and let us get more work done, and this assistant was built to accomplish tasks automatically in that manner. According to [2], voice AI has become a crucial component of our technological landscape as a practical and effective method for managing tasks. Users can carry out a variety of tasks using voice commands, without the usual manual workflows or manual input. This automation improves the overall user experience, promotes productivity, and saves time. It makes sense to extend the same procedure to include book recommendations. Similar collaborative
III. PROPOSED SYSTEM

Given the numerous drawbacks of the current system, the proposed system is an AI-based application in which computer applications are operated by voice: it listens to the user's speech and then reacts in accordance with the user's command. As a result, it saves time and energy. Instead of manually typing out long documents, a user can dictate and have their words converted into text. The voice command open [filename] can be used to find any file, directory, or application. When the user requests information by voice, the assistant describes the data and displays only the relevant information. Another special feature is the ability to handle mail by voice: the user can log in, enter the data, and send the mail using voice commands. The assistant can likewise be used with other programs, including storing files and opening them. In addition to the advanced modules, it can also suggest books to read and films to watch if the user wishes. Users encounter many situations in their daily lives in which they may feel joyful, sad, afraid, surprised, or neutral. Deep Face is a module that can identify the user's emotion, and the assistant plays music based on that emotion. The final and most crucial module produces the spoken output. It includes a speak method that speaks the output for user commands and system-processed queries in addition to displaying it. In Fig. 1, we employ the Python text-to-speech module (pyttsx3) and the Microsoft Speech Application Programming Interface (sapi5) to turn the text into speech on the computer.

When a user speaks a command into the voice bot, it runs the query and outputs the result. The role that voice assistants play is becoming increasingly important.

The ability of the final system to handle many kinds of queries will help save a lot of time and work. With the help of a few modules, it runs several applications, sends emails, and shows the user the date and time. Voice assistants are currently gaining popularity; in a few years they are likely to be in widespread use, and their rapid expansion is expected to continue.

IV. RESULTS AND DISCUSSION

The application consists of several modules. The "speech recognition" package available for Python is a useful tool for implementing voice input. For audio recording, this library depends on another library. You can use the library's 'Recognizer' class to get started with voice command recognition. This class offers methods for recognizing speech from a microphone as well as from other sources. The 'listen' method is used to record audio input from the user through the microphone. To ensure reliable voice recognition, you might need to adjust the recording's energy threshold to filter out background noise. The ideal threshold depends on the ambient noise level. Cloud-based services such as Google Speech Recognition, which need a live internet connection, are frequently used to convert speech to text. These services process and convert speech into text using powerful computers and machine learning algorithms. The query-processing module accepts user commands, processes them, and generates the necessary outputs based on various conditions. Its operations are described below.

Speech Detection: This module is in charge of turning spoken commands from the user into text that the system can comprehend. It allows the computer to hear the user's voice and recognize the command.
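The capture step described above (create a 'Recognizer', calibrate its energy threshold, 'listen', then send the audio to a cloud recognizer) might look roughly like the following. This is a minimal sketch rather than the authors' code: it assumes the SpeechRecognition package with PyAudio for microphone access, and the helper names normalize, capture_command, and listen_once are hypothetical.

```python
from typing import Optional, Tuple, Type


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so commands compare reliably."""
    return " ".join(text.lower().split())


def capture_command(recognizer, source,
                    recoverable: Tuple[Type[BaseException], ...],
                    noise_seconds: float = 1.0) -> Optional[str]:
    """Record one utterance and return it as normalized text, or None.

    `recognizer` is expected to behave like speech_recognition.Recognizer and
    `source` like an opened speech_recognition.Microphone.
    """
    # Calibrate the recognizer's energy threshold against ambient noise.
    recognizer.adjust_for_ambient_noise(source, duration=noise_seconds)
    audio = recognizer.listen(source)  # blocks until a phrase has been spoken
    try:
        # Google Web Speech API: cloud-based, needs a live internet connection.
        return normalize(recognizer.recognize_google(audio))
    except recoverable:
        # Unintelligible speech or an unreachable service: report "no command".
        return None


def listen_once() -> Optional[str]:
    """Wire capture_command to the real microphone (hypothetical helper)."""
    import speech_recognition as sr  # third-party; imported lazily on purpose

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        return capture_command(recognizer, source,
                               (sr.UnknownValueError, sr.RequestError))
```

Calling listen_once() returns the recognized command as lowercase text, or None when nothing usable was heard. Passing the recognizer in as an argument keeps the capture logic testable without audio hardware.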
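Once a command is available as text, the query-processing step tests it against conditions and produces the matching output. The sketch below illustrates one possible way to do this for "open ..." commands. The command tables and the functions resolve and execute are illustrative assumptions, since the paper does not list its actual conditions; only Python's standard library is used.

```python
import subprocess
import webbrowser
from typing import Optional, Tuple

# Hypothetical command tables; the real assistant may use different entries.
SITES = {"youtube": "https://www.youtube.com"}
APPS = {"notepad": "notepad.exe", "file explorer": "explorer.exe"}  # Windows


def resolve(command: str) -> Optional[Tuple[str, str]]:
    """Map recognized text to an (action, target) pair, or None if unknown."""
    words = command.lower().split()
    if len(words) < 2 or words[0] != "open":
        return None
    target = " ".join(words[1:])
    if target in SITES:
        return ("browse", SITES[target])
    if target in APPS:
        return ("launch", APPS[target])
    return None


def execute(command: str) -> bool:
    """Run the action for a command; return False when no condition matched."""
    action = resolve(command)
    if action is None:
        return False
    kind, target = action
    if kind == "browse":
        webbrowser.open(target)      # opens the site in the default browser
    else:
        subprocess.Popen([target])   # starts the program without waiting
    return True
```

Keeping resolve free of side effects means the conditions can be checked on their own, while execute performs the actual launch.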
Fig. 3 shows the speech recognition module. In this instance, it records the user's spoken command "Open YouTube" and converts it to text. Following conversion, the text is used to test the condition, in this case "Open YouTube". Once the condition is satisfied, the system generates the corresponding output, which causes the YouTube website or application to launch.

Fig. 7 shows the movie recommendation module, which provides recommendations to the user by integrating a recommendation algorithm that assesses several aspects. To produce initial recommendations, it considers elements including movie reviews, release dates, and viewership. The module also adapts as the user watches more content, further honing the recommendations to match their interests.
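To make the idea of combining reviews, release dates, and viewership more concrete, the following sketch ranks unwatched movies by a weighted score and adds a term that grows with the user's watch history. It is only an illustration under stated assumptions: the Movie fields, the genre-based affinity term, and the weights are hypothetical and are not taken from the paper's recommender.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Movie:
    title: str
    rating: float  # average review score on a 0-10 scale (assumed)
    year: int      # release year
    views: int     # viewership count
    genre: str     # assumed field used to model the user's interests


def recommend(catalog: List[Movie], watched: Iterable[Movie], top_n: int = 3,
              weights: Tuple[float, float, float, float] = (0.4, 0.2, 0.2, 0.2)
              ) -> List[str]:
    """Rank unwatched movies by reviews, recency, viewership, genre affinity."""
    if not catalog:
        return []
    watched = list(watched)
    seen = {m.title for m in watched}
    genre_counts = Counter(m.genre for m in watched)
    years = [m.year for m in catalog]
    oldest, newest = min(years), max(years)
    max_views = max(m.views for m in catalog) or 1
    w_rating, w_recency, w_views, w_genre = weights

    def score(m: Movie) -> float:
        # Each term is scaled to the range 0..1 before weighting.
        recency = (m.year - oldest) / (newest - oldest) if newest > oldest else 1.0
        affinity = genre_counts[m.genre] / len(watched) if watched else 0.0
        return (w_rating * m.rating / 10 + w_recency * recency
                + w_views * m.views / max_views + w_genre * affinity)

    candidates = [m for m in catalog if m.title not in seen]
    ranked = sorted(candidates, key=score, reverse=True)
    return [m.title for m in ranked[:top_n]]
```

With an empty watch history the ranking depends only on reviews, recency, and viewership; as movies are watched, titles in the genres the user prefers move up the list.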
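The emotion-based music player introduced in Section III can be sketched in a similar way. The example assumes the deepface package, whose DeepFace.analyze call reports a dominant emotion for a face image; the playlist names and the helper functions are hypothetical, and the paper does not state how emotions are mapped to music.

```python
from typing import Optional

# Hypothetical mapping from DeepFace emotion labels to playlist names.
PLAYLISTS = {
    "happy": "upbeat",
    "sad": "comforting",
    "fear": "calming",
    "surprise": "energetic",
    "neutral": "ambient",
}


def dominant_emotion(result) -> Optional[str]:
    """Extract the dominant emotion from a DeepFace.analyze result.

    Newer DeepFace releases return a list with one dict per detected face,
    older ones a single dict; both shapes are accepted here.
    """
    if isinstance(result, list):
        if not result:
            return None
        result = result[0]  # use the first detected face
    return result.get("dominant_emotion")


def pick_playlist(emotion: Optional[str], default: str = "ambient") -> str:
    """Choose a playlist for an emotion, falling back to a default."""
    return PLAYLISTS.get(emotion, default)


def playlist_for_image(image_path: str) -> str:
    """Analyze a face image and choose a playlist (hypothetical helper)."""
    from deepface import DeepFace  # third-party; imported lazily on purpose

    result = DeepFace.analyze(img_path=image_path, actions=["emotion"])
    return pick_playlist(dominant_emotion(result))
```

Emotions without an entry in the table, such as "angry", fall back to the default playlist instead of raising an error.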
As shown in Fig. 4, by integrating the Deep Face module, the system can determine the user's emotions from their facial expressions and make tailored music recommendations as a result.

V. CONCLUSION AND FUTURE WORK

Voice assistants are likely to be used more frequently as they develop and become more popular. With improvements in natural language understanding and rising user acceptance, voice assistants have the potential to completely change how people engage with technology. These assistants are important in daily life because they can
REFERENCES