
Volume 8, Issue 6, Jun – 2023 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165

An AI Powered Voice Assistant for Enhanced User Interaction (Voice-Bot)
Arul Natarajan1
(Assistant Professor), Computer Science and Engineering
St. Peter’s Engineering College
Hyderabad, India

Badam Shashank2
Computer Science and Engineering
St. Peter’s Engineering College
Hyderabad, India

Vidya Rajasekaran3
Department of Information Technology
B.S. Abdur Rahman Crescent Institute of Science and Technology, Chennai, India

Boga Roja4
Computer Science and Engineering
St. Peter’s Engineering College
Hyderabad, India

Amjan Shiek5
Professor and Head, Computer Science and Engineering
St. Peter’s Engineering College
Hyderabad, India

J. Ramesh Babu6
Associate Professor, Computer Science and Engineering
St. Peter’s Engineering College
Hyderabad, India

Abstract:- An AI voice assistant can make it easier for people who are less technically knowledgeable to use their computers and carry out everyday activities. By offering voice-based interaction, it lowers the barrier for people who might feel intimidated by or unfamiliar with conventional computer interfaces. The ability to launch programs with voice commands such as "open Notepad" or "open File Explorer" makes the user experience more convenient and accessible, since particular applications can be started without manually browsing through menus or desktop shortcuts. In addition to standard computer functions such as restarting and shutting down, the assistant offers a number of more sophisticated features. The book recommender proposes books based on the user's choices, while the movie recommender helps users discover new films based on their preferences.

Keywords:- Deep Face, Voice Bot, Speech-to-text, and Guidance

I. INTRODUCTION

AI Voice Assistant is a desktop application. It is intended to make computer applications and most everyday computing tasks easier to carry out. Natural Language Processing (NLP) is the area of AI that enables software to work with human language. Voice assistants can interact with users in natural language and utilize cloud computing. Built in Python from components such as speech recognition, a text-to-speech engine, and Deep Face, the assistant is a helpful tool for completing tasks by voice. It includes a section with recommendations for films and books. The final module is a more sophisticated one: an emotion-based music player that recognizes the user's facial expression and plays music in line with that emotion. You can communicate with your computer by using a voice assistant. It is software for Windows that includes an intelligent personal assistant, a human language interface, automation, and speech recognition.

The long history of voice assistants began when IBM introduced the Shoebox at the Seattle World's Fair in 1962. The device could recognize spoken digits and be used as a basic calculator. AI voice assistants are one application area of NLP methods, which they use to comprehend and interpret human speech. They use cloud computing to process and analyze voice data, enabling natural language interaction with users. In general, AI voice assistants combine NLP techniques with cloud computing and have developed into useful tools for carrying out actions via voice commands, making them effective and user-friendly in a variety of sectors.

II. LITERATURE SURVEY

According to [1], voice assistants are a crucial component of our technological ecosystem since they allow us to communicate with devices and carry out activities orally. Nowadays, we prefer automated systems over routine manual processes because we want everything to be completed easily and quickly. Automated processes make us more efficient and let us get more work done, and we have built this assistant with that goal in mind. According to [2], voice AI has become a crucial component of our technological landscape as a practical and effective method for managing tasks. Users can carry out a variety of tasks using voice commands, without the usual manual workflows or manual input. This automation improves the overall user experience, promotes productivity, and saves time. It makes sense to extend the same procedure to include book recommendations. Similar collaborative
filtering techniques can be used to suggest books based on user choices and shared reading interests [3]. The work in [4] produces a customized musical experience that matches the user's emotional state by fusing facial expression analysis with music suggestion. Adding this layer of interactivity and adaptability to the music listening process improves the whole user experience. The research in [5] gave us an understanding of the process used to play music based on a user's mood by observing their facial expression using AI and deep learning (DL) algorithms. In addition to the aforementioned publications, we also found examples of desktop applications written in Python. Python's Tkinter library is used to create graphical user interfaces (GUIs), which makes Python a popular option for developers looking to quickly produce GUI-based applications.

Fig 1 Use Case Diagram
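The collaborative filtering idea mentioned in the survey above can be illustrated with a small sketch. The paper does not describe its own recommendation algorithm, so this is only an assumed, minimal user-based variant: the function names, the rating scale, and the sample data are all hypothetical and chosen for illustration.

```python
from math import sqrt

# Hypothetical sample data: each user maps book titles to a 1-5 rating.
SAMPLE_RATINGS = {
    "asha": {"Dune": 5, "Emma": 1},
    "ravi": {"Dune": 5, "Emma": 1, "Neuromancer": 5},
    "meera": {"Dune": 1, "Emma": 5, "Persuasion": 5},
}


def cosine_similarity(a, b):
    """Cosine similarity of two sparse rating dicts (missing ratings count as 0)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b)


def recommend(ratings, user, top_n=3):
    """Rank books the user has not rated by similarity-weighted ratings of other users."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine_similarity(ratings[user], other_ratings)
        if similarity <= 0:
            continue
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + similarity * rating
    # Highest score first; ties are broken alphabetically for a stable order.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:top_n]
```

Calling `recommend(SAMPLE_RATINGS, "asha")` ranks the books that similar readers rated highly and that "asha" has not rated yet. A real recommender would need far more data and some handling of users with no rating history.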

III. PROPOSED SYSTEM

Given the numerous drawbacks of the current system, the proposed system is an AI-based application in which computer applications are operated by voice commands: it captures the user's speech and then reacts in accordance with the user's order. As a result, it saves time and energy. Additionally, instead of manually typing out long documents, a user can use voice commands to convert their words into text. They can use the voice command open [filename] to find any file, directory, or application. When the user requests information by voice, the voice assistant describes the data and displays only the relevant information. Another special feature is the ability to handle mail using voice commands: logging in using the user's voice, entering the data, and sending the mail. The voice assistant can likewise be used with other programs, including storing files and opening them. If the user wishes, it can also suggest books to read and films to watch through its advanced modules. Users encounter many situations throughout their daily lives and may feel joyful, sad, afraid, surprised, or neutral in them. Deep Face is a module that can identify a user's emotions, and the assistant plays music based on those emotions. The final and most crucial module is the speech output. This module includes a speak method that speaks the output for user commands and system-processed queries, in addition to displaying it. As shown in Fig 1, we employ the Python text-to-speech module (pyttsx3) and the Microsoft Speech Application Programming Interface (sapi5) to turn text into speech on the computer.

When a user speaks a command to the voice bot, it runs the query and outputs the result. The role that voice assistants play is becoming increasingly important.

The ability of the final system to handle any kind of inquiry will help save a lot of time and work. With the help of a few modules, it runs several applications, sends emails, and shows the user the date and time. Voice assistants are currently gaining popularity, and we expect that within a few years they will be in widespread use as their rapid expansion continues.

IV. RESULTS AND DISCUSSION

The application consists of several modules. Python's SpeechRecognition package ("speech_recognition") is a useful tool for implementing voice input. For audio recording from a microphone, this library depends on another library (PyAudio). The library's 'Recognizer' class is the starting point for voice command recognition. This class offers methods for recognizing speech from a microphone as well as from other sources. The 'listen' method records audio input from the user through the microphone. To ensure reliable recognition, the recognizer's energy threshold may need to be adjusted to block out background noise; the ideal threshold depends on the ambient noise level. Cloud-based services such as Google Speech Recognition, which need a live internet connection, are frequently used to convert speech to text. These services process and convert speech into text using powerful computers and machine learning algorithms.

The query processing module accepts user commands, processes them, and generates the necessary outputs based on various conditions. Its operations are listed below.

Speech Detection: This module is in charge of turning spoken commands from the user into text that the system can process. It allows the computer to hear the user's voice and recognize the command.

Condition Checking: To identify the best course of action, the module evaluates a condition based on the received instruction. If the condition evaluates to true, the system proceeds to produce the corresponding output.

Date and Time: To access the current date and time, the system uses the datetime module. As a result, the system is able to give
the user the current date and time upon request. To launch or open an application such as Google, YouTube, or Notepad, we use the operating system (os) module. Its listdir() and startfile() functions are used to locate the necessary file and to launch the application, respectively.

Fig 4 Execution Page
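The speech detection, condition checking, date and time, and application launch steps described above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: the helper names (`listen_once`, `speak`, `find_file`, `handle_command`), the reply wording, and the keyword matching rules are hypothetical. The third-party packages are imported lazily so that the command-handling logic can be run without a microphone or a Windows speech engine.

```python
import datetime
import os


def listen_once():
    """Record one utterance from the microphone and return it as text ('' on failure)."""
    import speech_recognition as sr  # third-party SpeechRecognition package (needs PyAudio)

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Calibrate the energy threshold against the current background noise.
        recognizer.adjust_for_ambient_noise(source, duration=1)
        audio = recognizer.listen(source)
    try:
        # Cloud-based recognition; requires a live internet connection.
        return recognizer.recognize_google(audio)
    except (sr.UnknownValueError, sr.RequestError):
        return ""


def speak(text):
    """Read text aloud with pyttsx3 on top of the Windows SAPI5 driver."""
    import pyttsx3  # third-party package

    engine = pyttsx3.init("sapi5")
    engine.say(text)
    engine.runAndWait()


def find_file(name, folder):
    """Return the path of the first entry in folder whose name starts with name, else None."""
    for entry in os.listdir(folder):
        if entry.lower().startswith(name.lower()):
            return os.path.join(folder, entry)
    return None


def handle_command(text, now=None, launch=None, search_dir=None):
    """Condition checking: map recognized text to an action and return the reply."""
    command = text.lower().strip()
    now = now or datetime.datetime.now()
    # os.startfile exists only on Windows; elsewhere the caller must pass launch.
    launch = launch or getattr(os, "startfile", None)

    if command.startswith("open "):
        target = command[len("open "):]
        path = target
        if search_dir is not None:
            path = find_file(target, search_dir) or target
        if launch is None:
            return "Launching is not supported on this platform"
        launch(path)
        return "Opening " + target
    if "time" in command:
        return "The time is " + now.strftime("%H:%M")
    if "date" in command:
        return "Today is " + now.date().isoformat()
    return "Sorry, I did not understand that"
```

On a Windows machine with a microphone, one round trip would be `speak(handle_command(listen_once()))`. Passing `launch` and `now` as parameters is a design choice made here so that the condition checks can be exercised without starting real programs.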

In Fig 2, the speech recognition module records the user's spoken instruction "Open Notepad" and turns it into text. After conversion, the text is checked against the "Open Notepad" condition.

Fig 2 Notepad Page

Because the condition is true, the system then generates the corresponding result, which is to launch the Notepad application. The user can then interact with the Notepad application through its user interface.

Fig 3 YouTube Page

In Fig 3, the speech recognition module records the user's spoken command "Open YouTube" and converts it to text. Following conversion, the text is used to test the condition, in this case "Open YouTube". Once the condition is satisfied, the system generates the corresponding output, which launches the YouTube website or application.

As shown in Fig 4, by integrating the Deep Face module, the system can determine the user's emotions from their facial expressions and make tailored music recommendations as a result.

Fig 5 Response Page

Fig 6 Microphone Access

This feature gives the app a thoughtful and engaging touch while providing comfort and solace in trying circumstances.

Fig 7 Movie Recommendation

As shown in Fig 7, the movie recommendation module integrates a recommendation algorithm that assesses several factors. To produce initial recommendations, it considers elements including movie reviews, release dates, and viewership. The module also adapts as the user watches more content, further honing the recommendations to match their interests.

V. CONCLUSION AND FUTURE WORK

Voice assistants are likely to be used more frequently as they develop and become more popular. Thanks to improvements in natural language understanding and rising user acceptance, voice assistants have the potential to completely change how people engage with technology. These assistants are important in daily life because they can
expedite a variety of tasks, from sending messages to buying tickets, and so save time.

However, factors including user preferences, technological developments, privacy concerns, and the development of powerful and dependable voice recognition and processing capabilities will all affect the adoption and use of voice assistants. Nevertheless, the rise in popularity of voice assistants points to a future in which their use is pervasive and provides users with major time savings. The primary goal of the project was to create a desktop assistant that answers questions posed by users and gives them the information they need in sufficient detail. A background investigation was carried out, including a review of the conversation process and of the relevant desktop assistants available.

Natural Language Processing is the method used by voice-controlled devices to understand the human speaker's words, process the question, and answer the speaker with the result. For a device to understand in this way, artificial intelligence must be incorporated into it so that it can function intelligently, operate IoT applications and devices, and reply to queries by searching the web for answers and processing them. The technology is currently taking its first steps, even though they are bold ones. However, we anticipate that personal assistants will advance quickly, providing better interactivity, better speech recognition, and the capacity to handle more complex problems.

Personal assistants will keep getting better. With each request that is completed, the system improves thanks to machine learning algorithms. As a result, smart speakers will become more intelligent and will be able to carry on conversations rather than simply respond to individual queries.

Additionally, the intelligent voice assistant will pick up on the owner's preferences and routines. It will therefore be able to give a more tailored experience by anticipating inquiries and providing more accurate search results. The biggest barrier to the global adoption of smart speakers is localization. The majority are presently offered in Chinese, German, or English. Localization to other languages will open the gateway to larger markets.

The capabilities of voice assistants are constantly evolving. To enhance them, AI businesses leverage data from current systems. In the end, according to Lucas, the voice assistant may become so intelligent that it will automatically order a pizza if you mention being hungry: using historical data from your prior transactions, it will determine that saying you are hungry is equivalent to placing an order for pizza.

REFERENCES

[1] Hoy, M. B. (2018). Alexa, Siri, Cortana, and more: An introduction to voice assistants. Medical Reference Services Quarterly, 37(1), 81-88. https://doi.org/10.1080/02763869.2018.1404391
[2] Majumdar, S., Kirkley, S., & Srivastava, M. (2022). Voice command AI assistant for public safety. 2022 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT). https://doi.org/10.1109/3ict56508.2022.9990716
[3] O'Shaughnessy, D. (2003). Interacting with computers by voice: Automatic speech recognition and synthesis. Proceedings of the IEEE, 91(9), 1272-1305. https://doi.org/10.1109/jproc.2003.817117
[4] M. Chenna Keshava, P. Narendra Reddy, S. Srinivasulu, & B. Dinesh Naik. (2020). Machine learning model for movie recommendation system. International Journal of Engineering Research and, V9(04). https://doi.org/10.17577/ijertv9is040741
[5] Emotion recognition from facial expression using deep learning. (2019). International Journal of Engineering and Advanced Technology, 8(6S), 91-95. https://doi.org/10.35940/ijeat.f1019.0886s19
[6] Kenny, P. G., & Parsons, T. D. (2011). Embodied conversational virtual patients. Conversational Agents and Natural Language Interaction, 254-281. https://doi.org/10.4018/978-1-60960-617-6.ch011
[7] Epstein, J., & Klinkenberg, W. (2001). From Eliza to internet: A brief history of computerized assessment. Computers in Human Behavior, 17(3), 295-314. https://doi.org/10.1016/s0747-5632(01)00004-8
[8] Kunekar, P., Deshmukh, A., Gajalwad, S., Bichare, A., Gunjal, K., & Hingade, S. (2023). AI-based desktop voice assistant. 2023 5th Biennial International Conference on Nascent Technologies in Engineering (ICNTE). https://doi.org/10.1109/icnte56631.2023.10146699
[9] Jacobsen, N. (2016). IPhone 4S: Ein gigantischer, Aber trügerischer Erfolg. Das Apple-Imperium 2.0, 51-54. https://doi.org/10.1007/978-3-658-09548-2_11
[10] Han, Y. (2021). A study on communication between AI (Artificial intelligence) voice assistant and AI speaker users and services usage: Focusing on anthropomorphism personality, social presence, and personalization. Journal of Communication Science, 21(3), 225-275. https://doi.org/10.14696/jcs.2021.09.21.3.225
[11] Naureen, A., Siddiqa, A., & Devi, P. J. (2022). Amazon product Alexa’s sentiment analysis using machine learning algorithms. Lecture Notes in Networks and Systems, 543-551. https://doi.org/10.1007/978-981-16-8512-5_57

IJISRT23JUN2047 www.ijisrt.com 1872
