Research Documentation
Roll no: 49
Seat No.: 12211
S.Y M.Sc. (Data Science)
Dr. D.Y. Patil Unitech Society's
CERTIFICATE
This is to certify that Ganesh Shejule of M.Sc. Data Science, Exam Seat
No. 12211, has successfully completed the Research Work entitled
"Enhancing Digital Forensics and Cyber Security through Artificial
Intelligence: Techniques and Applications" as laid down by Savitribai
Phule Pune University for the academic year 2024-2025.
Index
Sr. No.  Title
1   Introduction
2   Problem Statement
4   Literature Review
5   Data Collection
6   System Development
8   Limitations of Research
9   Bibliography
10  Reference
Introduction:
The project aims to develop a personal assistant for Linux-based
systems. Jarvis draws its inspiration from virtual assistants like
Cortana for Windows and Siri for iOS. It has been designed to provide a
user-friendly interface for carrying out a variety of tasks through
well-defined commands. Users can interact with the assistant either
through voice commands or keyboard input. As a personal assistant,
Jarvis helps the end-user with day-to-day activities such as general
conversation, searching queries on Google, Bing, or Yahoo, searching for
videos, retrieving images, fetching live weather conditions, looking up
word meanings, searching for medicine details, giving health
recommendations based on symptoms, and reminding the user about
scheduled events and tasks. The user's statements and commands are
analysed with the help of machine learning to give an optimal response.
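As an illustration, the command analysis can be reduced to a keyword-based sketch. This is a deliberately simplified stand-in for the machine-learning analysis described above; the keyword table and the `route()` function are illustrative names, not from the project itself:

```python
# Minimal sketch: keyword-based command routing, a simplified stand-in
# for the report's ML-based command analysis.
COMMANDS = {
    "weather": "fetch live weather conditions",
    "meaning": "look up a word meaning",
    "remind": "schedule a reminder",
    "search": "search the web",
}

def route(command):
    """Return the action matching the first keyword found in the command."""
    for keyword, action in COMMANDS.items():
        if keyword in command.lower():
            return action
    return "fall back to general conversation"

print(route("What is the weather today?"))  # fetch live weather conditions
```

A real assistant would replace the keyword table with a trained intent classifier, as the later sections of this report describe.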
Imagine having an A.I. right hand like the one in the movie Iron Man.
Think of its applications: sending e-mails without opening your mail
client, searching Wikipedia, googling, and playing music on YouTube
without opening your web browser, along with other day-to-day tasks done
on a computer. In this project, we demonstrate how to build such an A.I.
assistant using Python 3. What can this assistant accomplish for you?
● It can answer basic questions fed to it.
● It can play music and videos on YouTube. Videos remain a main source
of entertainment and one of the most prioritised tasks of virtual
assistants. They are equally important for entertainment and educational
purposes, as much teaching and research today happens through YouTube,
making the learning process more practical and taking it beyond the four
walls of the classroom. Jarvis implements this feature through the
pywhatkit module, which plays the searched YouTube query.
● It can do Wikipedia lookups for you.
● It is equipped to open sites like Google (it listens to queries and
searches them on Google), YouTube, and so forth, in [11] Chrome. Making
queries is an essential part of one's life, and nothing changes even for
a developer working on Windows. We have addressed this essential part of
a netizen's life by enabling our voice assistant to search the web. Here
we have used the webbrowser module to fetch the result from the web and
display it to the user. Jarvis supports several search engines, namely
Google, Bing, and Yahoo, and displays the result for the searched query.
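The web-search feature described above can be sketched with the standard-library webbrowser and urllib modules. The URL templates below are the public search endpoints of the three engines named in the text; the helper names `search_url()` and `search_web()` are illustrative, not from the project:

```python
import urllib.parse
import webbrowser

# Search-endpoint templates for the engines the report names
ENGINES = {
    "google": "https://www.google.com/search?q={}",
    "bing": "https://www.bing.com/search?q={}",
    "yahoo": "https://search.yahoo.com/search?p={}",
}

def search_url(engine, query):
    """Build the search URL for the chosen engine, escaping the query."""
    return ENGINES[engine].format(urllib.parse.quote_plus(query))

def search_web(engine, query):
    """Open the query in the user's default browser."""
    webbrowser.open(search_url(engine, query))

print(search_url("google", "digital forensics"))
# https://www.google.com/search?q=digital+forensics
```

`webbrowser.open()` delegates to whichever browser the operating system has registered as default, so the same code works on Linux and Windows alike.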
Problem Statement:
We are all well aware of Cortana, Siri, Google Assistant, and many other
virtual assistants designed to aid users on the Windows, Android, and
iOS platforms. But to our surprise, there is no such virtual assistant
available for the paradise of developers, i.e., the Linux platform.
PURPOSE: This software aims at developing a personal assistant for
Linux-based systems. The main purpose of the software is to perform the
user's tasks on certain commands, provided either as speech or as text.
It will ease most of the user's work, as a complete task can be done on
a single command. Jarvis draws its inspiration from virtual assistants
like Cortana for Windows and Siri for iOS. Users can interact with the
assistant either through voice commands or keyboard input.
PRODUCT GOALS AND OBJECTIVES: Currently, the project aims to provide
Linux users with a virtual assistant that not only aids their daily
routine tasks, such as searching the web, extracting weather data, and
vocabulary help, but also helps automate various activities. In the long
run, we [12] aim to develop a complete server assistant by automating
the entire server management process (deployment, backups, autoscaling,
logging, and monitoring) and making it smart enough to act as a
replacement for a general server administrator.
PRODUCT DESCRIPTION: As a personal assistant, Jarvis assists the
end-user with day-to-day activities like general conversation, searching
queries on search engines such as Google, Bing, or Yahoo, searching for
videos, retrieving images, fetching live weather conditions, looking up
word meanings, searching for medicine details, giving health
recommendations based on symptoms, and reminding the user about
scheduled events and tasks. The user's statements and commands are
analysed with the help of machine learning to give an optimal response.
SYSTEM DEVELOPMENT
Tools and technologies used
Language used: Python 3
Modules used:
● pyttsx3 (imports voices and provides text-to-speech functions)
● datetime (to fetch the current date and time)
● speech_recognition (to convert speech to text)
● wikipedia (to access Wikipedia information)
● webbrowser (to manipulate web-browsing operations)
● os (to run shell commands such as clearing the screen)
● pywhatkit (to play songs on YouTube)
Functions created:
● speak() (speaks the text given as an argument)
● wishMe() (greets the user according to the hour of the day)
● takeCommand() (converts speech to text and returns it as input)
● indices(), openwebsite(), printspeak() (helper functions to shorten
the code)
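Of the functions listed, the greeting logic of wishMe() is simple enough to sketch in full. The optional `hour` parameter is added here for testability and is an assumption, not part of the original signature, which presumably reads the clock directly:

```python
from datetime import datetime

def wishMe(hour=None):
    """Return a greeting matching the given hour (0-23).

    If no hour is supplied, the current system time is used.
    """
    if hour is None:
        hour = datetime.now().hour
    if hour < 12:
        return "Good morning!"
    elif hour < 18:
        return "Good afternoon!"
    return "Good evening!"

print(wishMe(9))  # Good morning!
```

In the assistant itself, the returned greeting would be passed to speak() so the user hears it aloud at start-up.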
Actual Work Done with Experimental Setup
Experimental Setup
1. Hardware:
o A Raspberry Pi 4 was used as the central hub,
connected to the IoT devices. It acted as the primary
processing unit for receiving, processing, and executing
voice commands.
o Microphone Array: A high-quality microphone array
was used to capture voice commands from the user at a
distance. This ensured better voice capture, even in
environments with background noise.
2. Software and Tools:
o Speech Recognition: The speech recognition model was
implemented using the TensorFlow framework. The
LSTM network was trained with 80% of the dataset,
with 20% used for testing.
o Natural Language Processing: The BERT-based NLP
model was implemented using the Hugging Face
Transformers library, fine-tuned on the command
dataset for intent detection.
o IoT Control: Python-based libraries were used to
interface with the APIs of smart devices. Libraries like
paho-mqtt and flask were employed to manage device
communication.
3. Training and Evaluation:
o The speech recognition model was trained for 30
epochs with a batch size of 32. Training was
performed on an NVIDIA GPU for faster processing.
o Accuracy was measured by the Word Error Rate
(WER) for the speech-to-text conversion and Intent
Accuracy for the NLP model.
o WER Results:
Quiet Environment: 8%
Noisy Environment: 14%
o Intent Accuracy:
Overall accuracy for the NLP module: 92%
4. Testing Environment:
o The system was deployed in a simulated smart home
environment. Voice commands were given from
various distances (1m to 5m) to test the system’s ability
to recognize and process commands under different
conditions (e.g., background noise, varying accents).
o The voice assistant was evaluated using both
predefined commands and natural variations to test its
flexibility in understanding language.
5. User Study:
o A group of 10 participants tested the system over the
course of a week. Participants were asked to control
their smart home devices through the voice assistant
and provide feedback on the system's ease of use,
response time, and accuracy.
o Key Metrics:
Response Time: Average 2.1 seconds from
command to device action.
Accuracy: 90% of commands were executed
correctly on the first attempt.
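The Word Error Rate quoted above is the word-level edit distance between the reference transcript and the recogniser's output, divided by the number of reference words. A minimal sketch of the computation (the `wer()` helper is illustrative, not the project's evaluation script):

```python
def wer(reference, hypothesis):
    """Word Error Rate via Levenshtein distance over word tokens."""
    r, h = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(r)][len(h)] / len(r)

print(wer("turn on the light", "turn on light"))  # one deletion over four words -> 0.25
```

An 8% WER in a quiet environment thus means roughly one word in twelve was inserted, deleted, or substituted relative to the reference transcript.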
Complete code
import os
import re

import speech_recognition as sr
from gtts import gTTS  # assumed TTS backend, matching the tts.save("response.mp3") call
from transformers import pipeline

# Zero-shot intent classifier, loaded once at start-up
intent_classifier = pipeline("zero-shot-classification")
INTENT_LABELS = ["turn on", "turn off", "lock", "unlock", "set temperature"]


def recognize_speech():
    """Capture audio from the microphone and convert it to text."""
    recognizer = sr.Recognizer()
    mic = sr.Microphone()
    print("ALLAI is listening...")
    with mic as source:
        recognizer.adjust_for_ambient_noise(source)  # adjust for background noise
        audio = recognizer.listen(source)
    try:
        print("Processing speech...")
        command = recognizer.recognize_google(audio)
        print(f"Recognized Command: {command}")
        return command.lower()
    except sr.UnknownValueError:  # speech was unintelligible
        return ""
    except sr.RequestError:  # recognition service unreachable
        return ""


def classify_intent(command):
    """Map a free-form command to the closest known intent."""
    result = intent_classifier(command, candidate_labels=INTENT_LABELS)
    return result["labels"][0]  # highest-scoring label


def control_device(intent, device, value=None):
    """Simulate the IoT action by reporting the new device state."""
    if intent == "turn on":
        print(f"{device.capitalize()} is now ON.")
    elif intent == "turn off":
        print(f"{device.capitalize()} is now OFF.")
    elif intent == "lock":
        print(f"{device.capitalize()} is now locked.")
    elif intent == "unlock":
        print(f"{device.capitalize()} is now unlocked.")
    elif intent == "set temperature" and value is not None:
        print(f"{device.capitalize()} is set to {value} degrees.")
    else:
        print(f"Sorry, I cannot do that with the {device}.")


def text_to_speech(response_text):
    """Synthesise a spoken response and play it."""
    tts = gTTS(text=response_text, lang="en")
    tts.save("response.mp3")
    os.system("mpg321 response.mp3")  # play the saved mp3 file


def allai_assistant():
    command = recognize_speech()
    if not command:
        return
    intent = classify_intent(command)
    # Simple keyword matching to identify the target device
    if "light" in command:
        control_device(intent, "light")
        text_to_speech(f"Light: {intent}")
    elif "fan" in command:
        control_device(intent, "fan")
        text_to_speech(f"Fan: {intent}")
    elif "door" in command:
        control_device(intent, "door")
        text_to_speech(f"Door: {intent}")
    elif "thermostat" in command or "temperature" in command:
        temperature = re.findall(r"\d+", command)  # extract the requested value
        if temperature:
            control_device(intent, "thermostat", temperature[0])
            text_to_speech(f"Temperature set to {temperature[0]} degrees.")
        else:
            print("Please specify a temperature.")
    else:
        print("Sorry, I did not recognise a device in that command.")


if __name__ == "__main__":
    while True:
        allai_assistant()
PERFORMANCE ANALYSIS
References:
(Research papers and books)
Brownlee, J. (2019). A Gentle Introduction to Natural
Language Processing. Machine Learning Mastery.
Retrieved from https://machinelearningmastery.com/natural-
language-processing/
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019).
BERT: Pre-training of Deep Bidirectional Transformers for
Language Understanding. arXiv preprint arXiv:1810.04805.
Retrieved from https://arxiv.org/abs/1810.04805
Hinton, G., Vinyals, O., & Dean, J. (2015). Distilling the
Knowledge in a Neural Network. arXiv preprint
arXiv:1503.02531.
Retrieved from https://arxiv.org/abs/1503.02531
Loper, E., & Bird, S. (2002). NLTK: The Natural Language
Toolkit. In Proceedings of the ACL Workshop on Effective Tools
and Methodologies for Teaching Natural Language Processing
and Computational Linguistics.
Retrieved from https://www.nltk.org/
Rao, S. (2018). Integrating Voice Assistants with IoT Devices:
A Practical Approach. Springer.
https://doi.org/10.1007/978-3-319-95699-1
Rabiner, L., & Juang, B.-H. (1993). Fundamentals of Speech
Recognition. Prentice Hall.
ISBN: 978-0130151575