
AI Assistant

Submitted by
Randheer Kumar 2212678
Md Junaid 2212624
Vanshika 2212732
Ashwani Kumar Yadav 2212544
Rahul 2212667

Under the Guidance of


Er. Amitoj Kaur

Department of Computer Applications


In Partial Fulfilment of the Requirements for
the Award of Degree of

Bachelor of Computer Applications

DEPARTMENT OF COMPUTER APPLICATIONS


GULZAR SCHOOL OF MANAGEMENT
LUDHIANA
JAN-JUNE, 2025

CANDIDATE’S DECLARATION AND CERTIFICATE
___________________________________________________________________________

We hereby certify that the work presented in this report, entitled AI Assistant, in partial fulfilment of the requirements for the degree of Bachelor of Computer Applications, submitted to the Department of Computer Applications, Gulzar School of Management, Khanna, Punjab, by Randheer Kumar (2212678), Md. Junaid (2212624), Vanshika (2212732), Ashwani Kumar Yadav (2212544), and Rahul (2212667), is an authentic record of our own work carried out under the supervision of Er. Amitoj Kaur, Department of Computer Applications, Gulzar School of Management, Khanna, Punjab.

We further declare that the matter embodied in this report has not been submitted by us
for the award of any other degree.

Candidate(s) Signature:

This is to certify that the above statement made by the candidate is correct to the best of my
knowledge and belief.

Signature of HOD Signature of Supervisor


Dr. Shahbaz Majeed Er. Amitoj Kaur

Date: 28-03-2025

Table of Contents

S.NO Contents Page No.

1 Introduction 4

2 Literature Review & Problem Formulation 5-7

3 Objective and Scope of the Project 8

4 Feasibility Study 9-10

5 Significance of Project 11-12

6 Tools and Technology Used 13

7 Process Description and Time Frame 14-15

8 Contribution of the Project 16

9 Resources and Limitations 17-18

10 Project Outcome 19

1. Introduction
The AI Assistant project is a contemporary voice-operated system that helps users accomplish everyday tasks through natural language processing. Built with Python, HTML, CSS, JavaScript, Bootstrap, and the Eel framework, the assistant provides a clean bridge between the backend and the frontend. Python is the principal programming language, handling speech recognition, text-to-speech, and API integration for tasks such as messaging, opening applications, and automation. HTML, CSS, JavaScript, and Bootstrap power the frontend, providing a usable, visually appealing GUI through which users interact with the assistant in real time.

The system utilizes the Eel framework to bridge the Python backend and the frontend,
providing a seamless interaction between the voice assistant's logic and its interface. The
assistant is able to recognize voice commands, give responses, automate tasks such as
sending messages or opening applications, and even provide extra features such as face
authentication. By incorporating the latest technologies like speech recognition, Twilio for
messaging automation, and free AI API alternatives for conversational features, this AI
assistant is both powerful and flexible. It gives users a hands-free, intuitive, and interactive
experience for controlling their digital space more effectively.

2. Literature Review & Problem Formulation
Voice assistants have become an essential part of modern technology, integrating artificial
intelligence (AI) to facilitate user interactions. Systems like Apple's Siri, Google Assistant,
and Amazon Alexa use advanced speech recognition and natural language processing
(NLP) to provide assistance in various tasks. This study explores the development of a
voice assistant using Python, incorporating AI features such as speech recognition, text-to-speech
conversion, facial authentication, mobile automation, and a chat-based interface.

2.1 Literature Review

2.1.1 Existing Voice Assistants

Several AI-powered voice assistants exist, including:

Siri (Apple) – Uses NLP and machine learning to process commands.

Google Assistant – Integrated with Google services for automation.

Amazon Alexa – Focuses on smart home automation.


Jarvis AI (Custom Python Implementation) – Many developers have attempted to
replicate Iron Man’s J.A.R.V.I.S using Python with speech recognition and automation.

2.1.2 Voice Assistant Development Using Python


Python provides various libraries for building a voice assistant (a minimal sketch combining two of them follows this list):

Speech Recognition – SpeechRecognition, PyAudio.

Text-to-Speech – pyttsx3.

Natural Language Processing (NLP) – NLTK, spaCy, transformers.

Automation – PyAutoGUI, pywhatkit.

GUI Integration – Tkinter, Eel.
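
As an illustration of how the speech libraries above fit together, here is a minimal listen-and-respond sketch. It is not the project's exact implementation; the example commands and the use of the free Google Web Speech backend via recognize_google are assumptions.

# A minimal listen-and-respond loop built from the libraries listed above.
# The example commands and the Google Web Speech backend are assumptions.
from datetime import datetime

import pyttsx3
import speech_recognition as sr

recognizer = sr.Recognizer()
engine = pyttsx3.init()

def speak(text: str) -> None:
    """Convert a text response into audible speech."""
    engine.say(text)
    engine.runAndWait()

def listen() -> str:
    """Capture one phrase from the microphone and return it as lowercase text."""
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio).lower()
    except sr.UnknownValueError:
        return ""  # nothing intelligible was heard

if __name__ == "__main__":
    speak("How can I help you?")
    command = listen()
    if "time" in command:
        speak(datetime.now().strftime("It is %I:%M %p"))
    elif command:
        speak("You said " + command)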

2.1.3 Connecting Backend (Python) with Frontend (HTML, CSS,
JavaScript) Using Eel
Eel is a lightweight Python library that lets developers build interactive web-based frontends while keeping the backend logic in Python. This makes it a suitable choice for integrating a Python-based voice assistant with a modern UI; a minimal sketch of the bridging pattern follows.
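
The sketch below shows the basic Eel pattern of exposing a Python function to the web frontend; the folder name, file name, and function name are hypothetical, not the project's actual files.

# main.py — a minimal sketch of the Eel bridging pattern.
import eel

eel.init("web")  # folder containing index.html, CSS and JS

@eel.expose                      # callable from JavaScript as eel.process_command(...)
def process_command(text):
    """Backend handler: receive a command string from the UI and return a reply."""
    if "hello" in text.lower():
        return "Hello! How can I help you?"
    return "You said: " + text

if __name__ == "__main__":
    # Opens web/index.html in an app window and starts the event loop.
    eel.start("index.html", size=(480, 720))

On the frontend, JavaScript can call the exposed function with eel.process_command("hello")(reply => console.log(reply)); again, the names used here are purely illustrative.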

2.2 Problem Formulation


Despite the advancements in voice assistants, most AI models are either cloud-based
(requiring an internet connection) or dependent on expensive APIs. The goal of this
project is to:

Develop an offline-capable, Python-based voice assistant

Integrate a user-friendly GUI with animations (Siri wave, mic button click effect)

Enhance interaction with automation features (app & website opening, WhatsApp automation, mobile controls, face recognition)

Create a chat feature with history using a free ChatGPT API alternative

Implement hot-word detection for hands-free activation

2.3 Features of the Proposed System


Feature – Description

Speech Recognition – Convert voice commands into text

Text-to-Speech – Convert responses into human-like speech

Siri Wave Animation – Design a responsive Siri-like wave animation

Mic Button Animation – Create a UI animation for voice input activation

Windows App & Website Opening – Open apps and websites using voice commands

Hot Word Detection – Activate assistant hands-free

WhatsApp Automation – Send messages via WhatsApp using pywhatkit

Chat Feature & History – Implement AI-based chat functionality with a history log

Face Authentication & Recognition – Secure access using facial recognition

Mobile Automation – Make calls and send SMS using Python scripts
3. Objective and Scope of the Project
3.1 Objective:
The primary objective is to develop a Voice Assistant with advanced features like:

• Speech recognition and text-to-speech conversion.

• Integration with Windows applications and automation tasks.

• Hotword detection for seamless activation.

• WhatsApp and SMS automation.

• Face authentication for personalized experiences.

• AI-based chatbot for conversation and history tracking.

• Mobile automation, including calls and messages.

3.2 Scope:
The scope includes:

• The assistant will support voice-based interactions for ease of use.

• It will have customizable features for task automation.


• The application will integrate with third-party services for extended
functionality.
• The system will be scalable to include AI-based responses in future updates.

4. Feasibility Study
4.1 Development Tools:
Python: The primary backend language used to create the voice assistant (Jarvis
model). Python offers numerous libraries such as SpeechRecognition for voice input,
pyttsx3 for text-to-speech, and Eel for frontend-backend integration.

Eel: An open-source Python library for creating simple Electron-like desktop
applications using HTML, CSS, and JavaScript. Eel connects the backend Python
logic with the frontend interface.

Libraries for Features:

• Speech Recognition: Using the SpeechRecognition library for capturing and interpreting voice commands.

• Text-to-Speech: Using pyttsx3 to convert text responses into speech.

• OpenCV: For face authentication and face recognition functionalities.

• Selenium: For automating WhatsApp messages and web browsing.

• Twilio: For SMS sending and phone call automation.

4.2 Operational Feasibility:


Ease of Use: The system is designed to be user-friendly, with minimal setup
requirements. Voice-based interaction lets users work with the assistant without
manual input, making it highly efficient.

Voice-based Interaction: This feature eliminates the need for manual typing. The
assistant listens to user commands through a microphone, processes them using speech
recognition, and responds via text and speech synthesis.

Mic Button Animation: The frontend provides a microphone button that triggers a
listening state, showing an animation when clicked. This gives intuitive feedback
during voice input.

4.3 Functionality Features:
Windows App Open Feature: The assistant will support opening applications on the
Windows OS using Python scripts.

Website Launch Feature: The assistant can launch websites using the default web
browser. A command such as "Open Google" will open the browser and navigate to
google.com.
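
As an illustration of how such commands could be handled, the following is a minimal sketch; the command keywords, application paths, and website URLs are assumptions rather than the project's actual mappings.

# A minimal sketch of voice-driven app and website launching on Windows.
# The command keywords and paths below are illustrative assumptions.
import os
import webbrowser

WEBSITES = {
    "open google": "https://www.google.com",
    "open youtube": "https://www.youtube.com",
}
APPS = {
    "open notepad": r"C:\Windows\System32\notepad.exe",
}

def handle_open_command(command: str) -> bool:
    """Open a website or Windows application matching the spoken command."""
    command = command.lower().strip()
    if command in WEBSITES:
        webbrowser.open(WEBSITES[command])   # launches the default browser
        return True
    if command in APPS:
        os.startfile(APPS[command])          # Windows-only application launch
        return True
    return False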

Chat History: User interactions (voice and text) will be logged and stored locally to
provide a history of previous chats. This allows users to view or continue past
conversations.

Face Authentication/Recognition: Using OpenCV or a similar library, the assistant can
recognize the user’s face for security purposes, providing authentication before
accessing sensitive data or certain features.
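
One possible way to implement this with OpenCV is sketched below. It assumes the opencv-contrib-python package (for the LBPH recognizer) and a model previously trained on the authorized user's images and saved as trainer.yml; the file name and confidence threshold are illustrative assumptions.

# A hedged sketch of face authentication with OpenCV's LBPH recognizer.
# Assumes opencv-contrib-python and a previously trained model "trainer.yml".
import cv2

def authenticate_user(threshold: float = 60.0) -> bool:
    """Capture one webcam frame and check it against the trained face model."""
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    recognizer = cv2.face.LBPHFaceRecognizer_create()
    recognizer.read("trainer.yml")

    cam = cv2.VideoCapture(0)
    ok, frame = cam.read()
    cam.release()
    if not ok:
        return False

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in detector.detectMultiScale(gray, 1.3, 5):
        label, confidence = recognizer.predict(gray[y:y + h, x:x + w])
        # Lower confidence values mean a closer match for LBPH.
        if confidence < threshold:
            return True
    return False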

4.4 Automation Features:


WhatsApp Automation: Using libraries like pywhatkit or Selenium, the assistant can
send messages to contacts on WhatsApp based on user commands.
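
A minimal pywhatkit-based sketch is shown below; it assumes WhatsApp Web is already logged in on the default browser, and the phone number and message are placeholders.

# A minimal sketch of WhatsApp message automation with pywhatkit.
import datetime
import pywhatkit

def send_whatsapp(phone: str, message: str) -> None:
    """Schedule a WhatsApp message a couple of minutes from now via WhatsApp Web."""
    send_at = datetime.datetime.now() + datetime.timedelta(minutes=2)
    pywhatkit.sendwhatmsg(phone, message, send_at.hour, send_at.minute)

# Example (placeholder number):
# send_whatsapp("+911234567890", "Hello from the AI Assistant!")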

Mobile Automation (Phone Calls, SMS): The assistant can send SMS messages or
initiate phone calls using services like Twilio or by leveraging Python libraries for
mobile automation.
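
The sketch below shows one way this could look with the Twilio REST client; the credentials, phone numbers, and TwiML URL are placeholders, not the project's actual configuration.

# A hedged sketch of SMS and call automation with the Twilio REST client.
from twilio.rest import Client

ACCOUNT_SID = "your_account_sid"      # placeholder credentials
AUTH_TOKEN = "your_auth_token"
TWILIO_NUMBER = "+15005550006"        # placeholder Twilio number

client = Client(ACCOUNT_SID, AUTH_TOKEN)

def send_sms(to: str, body: str) -> str:
    """Send an SMS and return the message SID."""
    message = client.messages.create(to=to, from_=TWILIO_NUMBER, body=body)
    return message.sid

def make_call(to: str) -> str:
    """Place a call that plays Twilio's public demo TwiML response."""
    call = client.calls.create(
        to=to,
        from_=TWILIO_NUMBER,
        url="http://demo.twilio.com/docs/voice.xml",
    )
    return call.sid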

4.5 Security Considerations:


Face Recognition: The use of face recognition ensures that only authorized users can
access certain assistant features. This adds an extra layer of security to the system.

Hot Word Detection: Ensures that the system only listens for commands when activated
by the hot word, reducing unnecessary background listening.
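
One lightweight way to approximate hot-word detection is to listen in short chunks and only wake the assistant when the trigger word appears in the recognized text, as sketched below; dedicated wake-word engines are another option, and this sketch is an assumption rather than the project's exact mechanism.

# A simple approximation of hot-word detection: listen in short chunks and
# hand control to the assistant only when the trigger word is heard.
import speech_recognition as sr

HOTWORD = "jarvis"                     # example trigger word
recognizer = sr.Recognizer()

def wait_for_hotword() -> None:
    """Block until the hot word is detected in the microphone input."""
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)
        while True:
            audio = recognizer.listen(source, phrase_time_limit=3)
            try:
                text = recognizer.recognize_google(audio).lower()
            except sr.UnknownValueError:
                continue
            if HOTWORD in text:
                return  # caller now activates the full assistant loop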

5. Significance of Project
This project brings innovation in human-computer interaction, providing users with a
hands-free assistant for their daily tasks. Unlike traditional software interfaces, the
voice assistant improves accessibility and productivity, especially for users with
disabilities. The inclusion of AI-driven features makes it a next-generation personal
assistant.

5.1 Advancement in AI and Automation Technology:

AI Integration: The project incorporates Artificial Intelligence (AI) through voice
recognition, text-to-speech, and natural language processing (NLP). It pushes forward
the capabilities of AI in real-time communication with users, enabling more efficient
interactions between humans and machines.

Automation Features: The system can automate everyday tasks such as sending
messages, opening apps, or performing searches on the web. This is a practical
application of AI to improve user productivity and convenience.

5.2 Improved User Interaction:

Voice-Based Interaction: The project enables hands-free operation through voice
commands, making it highly useful for people with disabilities or those seeking a more
efficient, seamless interaction. This can improve accessibility and the overall user
experience, making technology more inclusive.

Natural Language Understanding: The use of libraries such as SpeechRecognition for
spoken input and pyttsx3 for spoken output allows users to interact with the assistant
in a more natural way, using everyday language rather than command-line syntax or
complex interfaces.

Ease of Use: The voice assistant’s user-friendly design, including features like mic
button animation, chat history, and intuitive responses, ensures a positive experience for
users regardless of their technical background.

5.3 Enhancing Productivity and Efficiency:

Task Automation: The voice assistant can perform various tasks like sending SMS,
opening apps, browsing websites, and even automating WhatsApp messages. This
allows users to perform multiple actions simultaneously without manual intervention,
increasing productivity.

Time-Saving: By automating repetitive tasks (like making phone calls, sending
messages, or searching the web), the assistant saves users valuable time that would
otherwise be spent performing these actions manually.

5.4 Future Scalability and Adaptability:


Integration with IoT: The project can easily be extended to include control over IoT
(Internet of Things) devices, such as smart speakers, security cameras, lights, and other
connected devices.

Cross-Platform Compatibility: The system can be adapted for different platforms,
including mobile apps, web browsers, and desktop applications, broadening its
applicability.

Expansion to More Complex Systems: The core features can serve as the basis for
more advanced AI systems in areas like virtual assistants, customer service bots, or even
integration into enterprise-level systems.

6. Tools and Technology Used
Python: The core programming language used for backend development, powering the
voice assistant's logic and automation features.

Eel: A Python library for integrating the backend with the frontend, enabling seamless
communication between Python and web-based UI (HTML, CSS, JavaScript).

HTML, CSS, JavaScript: Used for frontend development. HTML structures the user
interface, CSS handles styling, and JavaScript manages dynamic interactions (like
mic button animation).
pyttsx3 & SpeechRecognition: pyttsx3 is used for text-to-speech functionality,
converting text into spoken words, while SpeechRecognition captures and processes
voice commands, enabling voice-based interaction with the assistant.

OpenCV: A computer vision library used for implementing face recognition and
authentication features to secure access to the assistant.

Twilio API: Used for automating SMS and phone calls, enabling the assistant to send
messages and make calls programmatically.

SQLite: A lightweight database used to store chat history, logs, and user preferences
for persistence between sessions.
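
A minimal sketch of such a chat-history store is shown below; the database file name, table name, and columns are illustrative assumptions.

# A minimal sketch of the chat-history store using SQLite.
import sqlite3
from datetime import datetime

conn = sqlite3.connect("assistant.db")   # hypothetical database file
conn.execute(
    """CREATE TABLE IF NOT EXISTS chat_history (
           id INTEGER PRIMARY KEY AUTOINCREMENT,
           role TEXT NOT NULL,            -- 'user' or 'assistant'
           message TEXT NOT NULL,
           created_at TEXT NOT NULL
       )"""
)

def save_message(role: str, message: str) -> None:
    """Append one chat turn so past conversations persist between sessions."""
    conn.execute(
        "INSERT INTO chat_history (role, message, created_at) VALUES (?, ?, ?)",
        (role, message, datetime.now().isoformat()),
    )
    conn.commit()

def recent_history(limit: int = 20):
    """Return the most recent chat turns, newest first."""
    cur = conn.execute(
        "SELECT role, message, created_at FROM chat_history "
        "ORDER BY id DESC LIMIT ?", (limit,))
    return cur.fetchall()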

7. Process Description and Time Frame
7.1 Process Description
7.1.1 Which Process Model Are We Using?

We are using the Incremental Process Model for this project. It allows us to
develop the system in small modules and integrate them step by step.

7.1.2 Why Are We Using It?

The Incremental Model is suitable because it supports modular development
and easy testing of individual features like voice recognition, face authentication,
and automation tasks.

7.2 The project development followed a structured process:


Requirement Analysis & Planning (Week 1-2): Defined project objectives, features,
and system requirements. Created a detailed plan and timeline for the development
process.

Frontend & Backend Development (Week 2-5): Developed the user interface using
HTML, CSS, and JavaScript. Built the backend logic in Python, integrating with the
frontend using the Eel library.

Voice Recognition & Hotword Detection Implementation (Week 5-6): Implemented
voice recognition using the SpeechRecognition library and added hotword detection for
activating the assistant with a custom trigger word (e.g., "Jarvis").

Integration of Features (Face Authentication, Chatbot, Automation) (Week 6-8):
Integrated face authentication with OpenCV for security. Developed the chatbot
functionality and automated tasks like sending SMS and phone calls using the Twilio API.

Testing & Debugging (Week 8-9): Conducted thorough testing of all system
components, identified bugs, and performed debugging to ensure smooth operation.

Deployment & Documentation (Week 10-12): Deployed the assistant for use,
ensuring all components were functioning correctly. Created user and technical
documentation for future reference and maintenance.

8. Contribution of the Project

Unique Contribution of the Project to Society

The proposed voice assistant (Jarvis-like system) provides several unique
contributions to society that are not fully addressed by available AI systems today:

8.1 Integration of Accessibility and Security through Face Authentication:


While many voice assistants provide voice commands and automation, face
authentication ensures an added layer of security, particularly useful for people in
high-security environments or those who require extra privacy. Unlike typical AI
assistants, which rely solely on voice commands (which can be easily compromised),
this system integrates multi-factor authentication by combining voice recognition
with face recognition, ensuring that only authorized individuals can access sensitive
features.

8.2 Comprehensive Automation Across Multiple Platforms:


This voice assistant not only supports voice commands for desktop applications and
websites but also includes mobile automation features such as SMS sending, phone
call automation, and even WhatsApp automation. Most available AI assistants focus
on only one platform (e.g., mobile or desktop), whereas this assistant offers
cross-platform integration, working seamlessly across both.

8.3 Real-Time Adaptation through Chat History and Customization:


The assistant's ability to learn from past interactions by tracking chat history allows
it to adapt to the user’s specific needs and preferences over time. Unlike static AI
systems, which provide fixed responses, this assistant personalizes its interaction,
ensuring that users experience a tailored assistant that improves over time based on
continuous feedback.

9. Resources and Limitations

9.1 Resources:
9.1.1 Development environment
Python: The main programming language for backend logic and AI integration.
Eel: A Python library that connects the backend with a frontend using HTML,
CSS, and JavaScript. It allows you to create a web-based interface for the voice
assistant.

VS Code: The development IDE to write and manage the code, integrating both
the Python backend and frontend.

9.1.2 APIs:
Twilio: Used for automating SMS, phone calls, and WhatsApp messages.
OpenAI API Alternative: A free API alternative to OpenAI for chatbot
functionality and natural language processing.

9.1.3 Hardware:
Microphone: For capturing voice commands and enabling speech recognition.

Webcam: Used for the face authentication and face recognition features.

9.2 Limitations:

9.2.1 Limited AI Response Accuracy:


Using free API alternatives might affect the accuracy and quality of responses
compared to premium options like OpenAI's GPT, limiting the overall
performance.

9.2.2 Hotword Detection:

Constant listening for hotwords (like "Hey Jarvis") might require significant
system resources. The system needs to be capable of processing real-time audio
input efficiently.

9.2.3 Dependent on Microphone Quality:


Speech recognition accuracy is heavily reliant on the quality of the microphone
used. A low-quality mic might result in misinterpretation of commands.

10. Project Outcome
The project delivers a fully functional Voice Assistant with an intuitive frontend and
a powerful Python-based backend. The system efficiently processes voice commands,
recognizes faces, automates applications, and interacts with users dynamically. This
assistant serves as an accessible, interactive, and smart tool for daily operations,
showcasing a blend of AI, automation, and security features.

The AI Voice Assistant project results in a fully functional voice-controlled assistant
that can carry out a variety of tasks, from opening applications and sending WhatsApp
messages to automating phone calls through voice commands. With its speech-to-text
and text-to-speech functionality, the system provides natural, hands-free interaction,
enabling users to work with their computers or mobile phones without a hitch.
The assistant's capacity to automate frequent activities such as opening websites,
sending SMS, and managing chat history helps maximize productivity and provides a
more tailored user experience. Its face authentication feature adds an extra layer
of security, offering an effective way of authenticating users before granting them
access to the system.

Additionally, the project demonstrates the capability of Python, Eel, and other APIs
such as Twilio and free alternatives to OpenAI in building a flexible, user-friendly
assistant that can be tailored to the user's requirements. The frontend is developed
using HTML, CSS, and JavaScript to provide seamless interaction and visual cues in the
form of microphone animations and waveforms, much like other voice assistants. While
there are some constraints, such as dependence on microphone quality and the system
resources required for hotword detection, the result is a stable voice assistant that
can be extended with more features and refinements in the future.

