
Department of Electrical and Computer Engineering

North South University

Term Project

Enter your project topic here

Enter Name ID # Enter ID

Faculty:

Zunayeed Bin Zahir

Lecturer

ECE Department

Spring, 2021

Acknowledgement
In this section, write 2-3 lines acknowledging the individuals, websites, or organizations that helped you build your project, provided you with data or other required information (if any), or gave you any kind of academic support.

For example –

First of all, we would like to express our profound gratitude to our honorable course instructor, Dr. Hasan Uz Zaman, for his constant and meticulous supervision, valuable suggestions, patience, and encouragement to complete the project work. We would also like to thank the ECE department of North South University for providing us with the opportunity to have an industrial-level design experience as part of our undergraduate curriculum. Finally, we would like to thank our families and everybody who supported us and provided us with guidance for the completion of this project.

Abstract
In this section, write the summary of your project, i.e., why you are doing this project, what the probable applications of this project will be, and what kind of engineering problem you are solving with your project (e.g., if you build a filter, it will help you reduce noise, get rid of unwanted signals, and so on; P.S. do not use my example exactly if your project is a filter design), as well as what motivated you to select your topic in the first place.

For example -

In this report we present a robot with twelve degrees of freedom which has the capability of transporting limited-size objects from one place to another, with pick-up and drop capabilities. The robot's responses are based on speech recognition of verbal commands. In our project we used the Google speech recognition module, as well as our own speech processing software, in order to understand verbal commands, and we compared the results of the two. We also categorized objects into six specific categories according to the amount of gripping force required to lift them; the categories are divided according to object stiffness. An Android application was used to communicate with the robot through Bluetooth. The application decodes the human speech into an array of characters, which is transmitted to the robot using Bluetooth technology, and the robot's microcontroller decodes the messages into executable functions. We designed the robot such that it understands only fifteen distinct verbal commands and ignores all others. We also created a device which can locate the source of a sound. The key idea is to locate the sound source in three-dimensional space using the inverse square law of sound propagation: we used the inverse square law for distance calculation by measuring amplitude at three microphones. We calculated the frequency of the sound using digital signal processing and autocorrelation, and this frequency was applied in the inverse square law of sound propagation to find the distance of the source from a specific microphone. Being a speech-responsive mobile robot, it can be used effectively by people with disabilities to move objects from one place to another.

Table of Contents [use the page numbers according to your project]


Chapter 1:
Project Overview.................................................................................................................................1
1.1 Introduction................................................................................................................................2
1.2 Background ...............................................................................................................................2
1.3 Our proposed project..................................................................................................................5
1.3.1 Description of the idea:.......................................................................................................5
1.3.2 Difficulty.............................................................................................................................6
1.4 Motivation..................................................................................................................................6
1.5 Summary....................................................................................................................................7
Chapter 2: Related work.......................................................................................................................8
Chapter 3: Theory...............................................................................................................................15
3.1 Introduction..............................................................................................................................16
3.2 Details.......................................................................................................................................16
…….You can add any item according to your project
…….
Chapter 4: Structure of the system.....................................................................................................24
4.1 Introduction..............................................................................................................................25
4.2 Procedure and Functionality.....................................................................................................25
4.2.1 Procedure...........................................................................................................................25
4.2.2 Functions...........................................................................................................................26
4.3 Workflow..................................................................................................................................29
4.3.1 Outline of the workflow....................................................................................................29
4.4 Equipment and Schematic Diagrams...............................................................................33
Chapter 5: Modules used in this system.............................................................................46
5.1 List and description of the hardware used....................................................................46
5.2 List and description of the software used.....................................................................47
Chapter 6: Results and Discussion.....................................................................................68
6.1 Simulation.....................................................................................................................69
6.2 Results and findings......................................................................................................69
6.3 Analysis and explanation of your result and findings...................................................71
Chapter 7: Conclusion.........................................................................................................75
Bibliography [List of references].........................................................................................77

Appendices.........................................................................................................................................82
Appendix A: Copy your entire code (if any)…………………………………………………….83

List of Figures (if any)

Fig. No.    Figure Caption    Page No.

List of Tables (if any)

Table No.    Table Caption    Page No.

Chapter 1
Project Overview

1.1 Introduction
A speech-responsive robot is capable of responding to the human voice. Human speech is challenging to interpret, as doing so requires both speech processing and artificial intelligence, and recognition improves through continuous learning. Google speech recognition is one of the finest speech recognition systems available, and its API is freely accessible. Almost all internet users have used the Google speech recognition system, mostly to search online by voice, but very few have used the Google Speech Recognition API for mechanical control, robot control, and similar tasks. In this report we present a human speech responsive robot with multiple functionalities. We used both the Google Speech Recognition module and our own speech recognition software, and compared the results of both. We used an Android application to communicate with the robot. We also categorized the force required to lift objects into six specific categories. The robot has the capability to transport objects from one place to another with the help of a gripper: it can pick up and drop limited-size objects and move according to speech commands. The Android application processes the speech and sends the message to the microcontroller, which processes it into executable functions. The robot is also capable of locating the source of the sound.

1.2 Speech processing and speech recognition basics


Speech processing is the study of speech signals and the processing of these signals in a digital representation, usually using digital signal processing. Aspects of speech processing include the acquisition, manipulation, storage, transfer, and output of speech signals. The input side is called speech recognition and the output side is called speech synthesis.

Speech recognition involves techniques for recognizing and translating spoken language to text. It is the ability of a machine or program to identify words and phrases in spoken language and convert them to a machine-readable format. Rudimentary speech recognition software has a limited vocabulary of words and phrases, and it may only identify these if they are spoken very clearly. More sophisticated software has the ability to accept natural speech.

1.2.1 How speech recognition works


Speech recognition works using algorithms for acoustic and language modeling. Acoustic modeling represents the relationship between linguistic units of speech and audio signals; language modeling matches sounds with word sequences to help distinguish between words that sound similar. Often, hidden Markov models are also used to recognize temporal patterns in speech and improve accuracy.

1.2.2 Applications of speech recognition


Voice recognition is commonly used to operate a device, perform commands, or write without having to use a keyboard, mouse, or buttons. Today, automatic speech recognition programs are used in many industries, including healthcare, the military (e.g., F-16 fighter jets), telecommunications, and personal computing (i.e., hands-free computing). The most frequent applications of speech recognition within the enterprise include call routing, speech-to-text processing, voice dialing, and voice search.

1.2.3 Pros and Cons


While convenient, speech recognition technology still has a few issues to work through as it continues to be developed. The pros of speech recognition software are that it is easy to use and readily available: it now comes installed on many computers and mobile devices, allowing easy access. The downsides include its inability to capture words accurately due to variations in pronunciation, its limited support for languages other than English, and its inability to sort through background noise. These factors can lead to inaccuracies.

1.2.4 Performance

Speech recognition performance is measured by accuracy and speed. Accuracy is measured with the word error rate (WER). WER works at the word level and identifies inaccuracies in transcription, although it cannot identify how an error occurred. Speed is measured with the real-time factor. A variety of factors can affect speech recognition performance, including pronunciation, accent, pitch, volume, and background noise.
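
As a concrete illustration (standard definitions, not taken from this report), the word error rate compares a transcript against a reference transcript:

    WER = (S + D + I) / N

where S, D, and I are the counts of substituted, deleted, and inserted words and N is the number of words in the reference. For example, a 10-word reference transcribed with one substitution and one deletion gives WER = (1 + 1 + 0) / 10 = 20%. The real-time factor is RTF = (processing time) / (audio duration); an RTF below 1 means the recognizer runs faster than real time.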

1.2.5 List of popular speech recognition software

Notable open source speech recognition software engines:

 CMU Sphinx

 Julius

 Kaldi

Open-source applications that provide convenient user interfaces for the above:

 Simon

 Jasper project

Cross-platform web apps based on Chrome:

 Voice Notebook

 SpeechTexter

 Speechnotes

 Trint

Many cell-phone handsets have basic dial-by-voice features built in, and smartphones such as iPhones and BlackBerrys also support this. A number of third-party apps have implemented natural-language speech-recognition support, including:

 Indigo: Virtual assistant for Android, iOS, and Windows Phone, by Artificial Solutions

 S Voice: Samsung Galaxy's voice-based personal assistant

 Google Now: Android voice search

 Microsoft Cortana: Microsoft voice search

 Siri: Apple's virtual personal assistant

Windows also has built-in speech recognition software such as Cortana (mentioned above) and Windows Speech Recognition, and there are other add-ons and third-party apps for voice recognition in Windows 7/8/10, such as Braina, Dragon NaturallySpeaking, SpeechMagic, and so on.

1.3 Our proposed project


The main idea of our project was to build a speech-responsive robot with real-time speech processing and sound source localization.

1.3.1 Description of the idea:


The robot we have built responds to human speech. Very few robots exist at the moment that can understand natural human speech and act accordingly; there are some remote-controlled robots, but a voice-controlled robot is very rare. The robot we built acts on spoken commands given by a human, and it responds in real time.

Capability of the Robot:

The robot has the capability to:

 Pick up and drop objects

 Understand and process human speech in real time

 Locate the source of the sound, i.e., locate where the speaker is

1.3.2 Difficulty
The level of difficulty of this project was very high: speech recognition alone is a huge project, and understanding speech and acting on it in real time is not easy. We did not plan to rely solely on existing speech processing software, since none offered sufficient accuracy for our purposes; rather, we planned to build our own software, which we eventually did. First we developed the software for the speech processing, and then we built the hardware components. We also had to develop the theory of how we could detect the location of the speaker using three microphones. We built the robot from scratch by ourselves, and we also designed and built a unique gripper for picking up and dropping objects. Finally, we had to synchronize the hardware and software parts in order to complete the robot. We also implemented the system using the Google Speech Recognition API and compared its accuracy with that of our own software.

1.4 Motivation

Technology has always developed rapidly by and for the able-bodied. Upon close inspection, we can see that most modern-day innovations are for the able-bodied, and physically disabled people have often been left on the sidelines. Our objective was to design, develop, and build a robot that would provide a useful and efficient means for disabled people to do daily work with ease. It allows those unable to move to give voice commands so that the robot can do the work, carry limited-size objects, and help the person.

A robot that can help us do our work without using our hands would also be industrially profitable. It could aid people in difficult and unsafe situations such as rescue operations in disaster zones, and it could be used to defuse bombs or to assist doctors as a third hand in medical procedures.

Development in this field can open up boundless possibilities and a new era in robotics. It can result

in many new applications that can be very useful and have a great impact on the lives of people.

1.5 Summary
In this chapter, we have briefly described the basics of speech recognition, existing speech recognition software, and the main idea on which our project was built. We have described the capabilities of the robot, what motivated us to design and build this system, and our accomplishments. The following chapters describe the theory, details of the components used, the mechanical description and designs, and the overall structure of the system.

Chapter 2
Related work

2.1 Introduction
The existing work related to mobile robots with speech recognition and sound source localization that we discovered and found useful is described in this chapter. As we searched for similar systems, we found very few in which both audio processing and a robot implementation are synchronized together. We went through some similar projects, which gave us useful insights. We also searched for systems and existing papers on localizing a sound source, so that the robot could locate the source of a verbal command. However, most of the papers we found described locating the direction of the sound source, but did not give the distance.

2.2 Existing work related to speech recognition on robots

Studies on controlling robots by speech recognition are few. One of them (Robot-by-voice: Experiments on commanding an industrial robot using the human voice) used Microsoft speech recognition to conduct speech communication with the robot [1]. The robot was stationary and consisted solely of a single mechanical hand with the ability to move, grip, pick up, and drop objects in a specific area near the hand.

Voice Automated Mobile Robot presents the idea of a voice-automated mobile robot [2]. Their research was based on communicating with a robot through voice. Most of the paper is theoretical, giving a general idea of machine speech recognition, but it hardly shows practical applications.

Some research outside robot control has also been conducted based on speech recognition. Design of a Voice controlled Smart Wheelchair used CMU Sphinx to decode human speech and Google Glass for communication [3]. Their technology focused on communication with the wheelchair, and they achieved partial success in building speech recognition software.

2.2.1 Robot-by-voice: Experiments on commanding an industrial robot using the human voice
This paper reports a few results of an ongoing research project that aims to explore ways to command an industrial robot using the human voice. A demonstration is presented using two industrial robots and a personal computer (PC) equipped with a sound board and a headset microphone. The demonstration was coded using Microsoft Visual Basic and C#.NET 2003 and associated with two simple robot applications: one capable of picking-and-placing objects and going to predefined positions, and the other capable of performing a simple linear weld on a workpiece. The speech recognition grammar is specified using the grammar builder from the Microsoft Speech SDK 5.1. The paper also introduces the concepts of text-to-speech translation and voice recognition, and shows how these features can be used with applications built using the Microsoft .NET framework.

2.2.2 Voice Automated Mobile Robot


This paper elucidates the research and implementation of a voice-automated mobile robot. The robot is controlled through connected-speech input. The language input allows a user to interact with the robot in a way that is familiar to most people. The advantages of speech-activated robots are hands-free and fast data input. In the future, it is expected that speech recognition systems will be used as the man-machine interface for robots in rehabilitation, entertainment, etc. In view of this, the system is a source of a learning process for a mobile robot that takes speech input as commands and performs navigation tasks through a distinct man-machine interaction. The speech recognition system is trained in such a way that it recognizes defined commands, and the designed robot navigates based on the instructions in the speech commands. The complete system consists of three sub-systems: the speech recognition system, a central controller, and the robot. The authors studied various factors that interfere with speech recognition, such as noise and distance. The results show that the proposed robot is capable of understanding the meaning of speech commands.

2.2.3 Design of a Voice controlled Smart Wheelchair


This paper describes the design of a smart, motorized, voice-controlled wheelchair using an embedded system. The proposed design supports a voice activation system for physically disabled persons, incorporating manual operation. An Arduino microcontroller and a speaker-dependent voice recognition processor are used to support the navigation of the wheelchair. The direction and velocity of the chair are controlled by predefined Arabic voice commands. The speaker-dependent, isolated word recognition system (IWRS) for a definite utterance of Arabic words to suit the patient's requirements was programmed and successfully demonstrated. The speech signal processing techniques for extraction of sound parameters, noise removal, intensity and time normalization, feature matching, etc. are handled by the HM2007 speech processor, embedded efficiently in real time. The Arduino receives coded digital signals from the IWRS, which recognizes the voice commands, and controls the functions of the chair accordingly; the wheelchair does not respond to a false speech command. The overall mechanical assembly is driven by two 14 A/24 V/200 W DC motors with an engagement/disengagement clutch and a speed reduction gear with built-in locking control. The system is tested using a speech password to start operation and seven Arabic commands to control motion: "Amam (forward), Saree' (fast), Batee' (slow), Khalf (backward), Yameen (right), Yesar (left), Tawaqaf (stop)". It proved to work well in both quiet and noisy environments, with safe movement.

2.3 Research and publications on Sound Source Localization


There are many ways of localizing a source. Much research has been done on locating the position of a source by visual means, using triangulation and other positioning techniques, but using sound to locate a source is still maturing.

Some research has been conducted to locate the source of a sound. One significant study used the time-delay-of-arrival (TDOA) technique [1]. This paper (Robust Sound Source Localization Using a Microphone Array on a Mobile Robot) calculated the arrival time of sound and achieved an accuracy of 3 degrees. However, it has a range limitation of 3 meters, i.e., it can locate only sound sources that are within 3 meters.

Microphone arrays positioned in a special orientation were also used to locate the sound source [3], in Localization Estimation of Sound Source by Microphones Array. The limitation is that it does not locate the source position with respect to the microphone arrays: this paper gave us an estimate of the direction of the sound source but did not provide any data on distance.

Another study (Real-Time Sound Source Localization on an Embedded GPU Using a Spherical Microphone Array) used a spherical microphone array, a Graphics Processing Unit (GPU), and the SRP-PHAT algorithm for sound source localization [2]. This paper provides a better estimate of the sound source compared to the previous ones, but the approach is very expensive, as multiple arrays of microphones are needed to build the spherical microphone and a GPU is required for the estimation. It achieves very high accuracy in localizing the direction of the sound source but is limited in localizing distance.

Using blind source estimation to estimate the sound source was also investigated [4], in the paper Real-time multiple sound source localization using a circular microphone array based on single-source confidence measures. It also has limitations: it focuses only on direction, using circular arrays, and does not determine the distance.

2.3.1 Robust Sound Source Localization Using a Microphone Array on a Mobile Robot
In this paper, a robust method for localizing a sound source in three dimensions using an array of eight microphones is presented. The method is based on time-delay-of-arrival estimation. Results show that a mobile robot can localize, in real time, different types of sound sources over a range of 3 meters and with a precision of 3 degrees.

2.3.2 Localization Estimation of Sound Source by Microphones Array
In this paper, the authors study estimation of a sound source's angle and distance by a planar microphone array. They place microphones at the vertices of an equilateral triangle and a square, and estimate the source angle and the source-to-microphone distance according to the different delays from the source to each microphone. They develop an orientation segmentation method by analyzing the delay characteristics, and a quick estimation algorithm to reduce the computational complexity. They introduce a quasi-L1-autocorrelation algorithm and an interpolation algorithm to improve estimation accuracy. The system can be used for counter-terrorism, etc. The method is discussed theoretically and verified with experimental data.

2.3.3 Real-Time Sound Source Localization on an Embedded GPU Using a Spherical Microphone Array
Spherical microphone arrays are becoming increasingly important in acoustic signal processing

systems for their applications in sound field analysis, beamforming, spatial audio, etc. The

positioning of target and interfering sound sources is a crucial step in many of the above

applications. Therefore, 3D sound source localization is a highly relevant topic in the acoustic signal

processing field. However, spherical microphone arrays are usually composed of many microphones

and running signal processing localization methods in real time is an important issue. Some works

have already shown the potential of Graphic Processing Units (GPUs) for developing high-end real-

time signal processing systems. New embedded systems with integrated GPU accelerators providing

low power consumption are becoming increasingly relevant. These novel systems play a very

important role in the new era of smartphones and tablets, opening further possibilities to the design

of high-performance compact processing systems. This paper presents a 3D source localization

system using a spherical microphone array fully implemented on an embedded GPU. The real-time

capabilities of these platforms are analyzed, providing also a performance analysis of the localization

system under different acoustic conditions.

2.3.4 Real-time multiple sound source localization using a circular microphone array based on single-source confidence measures
This paper proposes a novel real-time adaptive localization approach for multiple sources using a

circular array, in order to suppress the localization ambiguities faced with linear arrays, and

assuming a weak sound source sparsity which is derived from blind source separation methods. The

proposed method performs very well both in simulations and in real conditions at 50% real-time.

2.4 Summary
The existing work related to speech recognition on mobile robots and sound source detection that we found useful has been briefly described in this chapter. The next chapter elaborates on the theoretical part of our project.

Chapter 3
Theory

3.1 Introduction
The details of the theory behind our system are discussed in this chapter. The theoretical explanation is divided into two sections:

1. Speech recognition

2. Sound source localization

3.2 Speech Recognition


To convert speech to on-screen text or a computer command, a computer has to go through several complex steps. When you speak, you create vibrations in the air. An analog-to-digital converter (ADC) translates this analog wave into digital data that the computer can understand. To do this, it samples, or digitizes, the sound by taking precise measurements of the wave at frequent intervals. The system filters the digitized sound to remove unwanted noise, and sometimes separates it into different bands of frequency (frequency is the rate at which the sound wave vibrates, heard by humans as differences in pitch). It also normalizes the sound, or adjusts it to a constant volume level. The sound may also have to be temporally aligned: people don't always speak at the same speed, so the sound must be adjusted to match the speed of the template sound samples already stored in the system's memory.

Next the signal is divided into small segments, as short as a few hundredths of a second, or even thousandths in the case of plosive consonant sounds (consonant stops produced by obstructing airflow in the vocal tract, like "p" or "t"). The program then matches these segments to known phonemes in the appropriate language. A phoneme is the smallest element of a language, a representation of the sounds we make and put together to form meaningful expressions. There are roughly 40 phonemes in the English language (different linguists have different opinions on the exact number), while other languages have more or fewer phonemes.

The next step seems simple, but it is actually the most difficult to accomplish and is the focus of

most speech recognition research. The program examines phonemes in the context of the other

phonemes around them. It runs the contextual phoneme plot through a complex statistical model
and compares them to a large library of known words, phrases and sentences. The program then

determines what the user was probably saying and either outputs it as text or issues a computer

command.

Early speech recognition systems tried to apply a set of grammatical and syntactical rules to speech.

If the words spoken fit into a certain set of rules, the program could determine what the words were.

However, human language has numerous exceptions to its own rules, even when it's spoken

consistently. Accents, dialects and mannerisms can vastly change the way certain words or phrases

are spoken. Imagine someone from Boston saying the word "barn." He wouldn't pronounce the "r"

at all, and the word comes out rhyming with "John." Or consider the sentence, "I'm going to see the

ocean." Most people don't enunciate their words very carefully. The result might come out as "I'm

goin' da see tha ocean." They run several of the words together with no noticeable break, such as

"I'm goin'" and "the ocean." Rules-based systems were unsuccessful because they couldn't handle

these variations. This also explains why earlier systems could not handle continuous speech – you

had to speak each word separately, with a brief pause in between them.

Today's speech recognition systems use powerful and complicated statistical modeling systems.

These systems use probability and mathematical functions to determine the most likely outcome.

According to John Garofolo, Speech Group Manager at the Information Technology Laboratory of

the National Institute of Standards and Technology, the two models that dominate the field today

are the Hidden Markov Model and neural networks. These methods involve complex mathematical

functions, but essentially, they take the information known to the system to figure out the

information hidden from it.

The Hidden Markov Model is the most common, so we'll take a closer look at that process. In this

model, each phoneme is like a link in a chain, and the completed chain is a word. However, the

chain branches off in different directions as the program attempts to match the digital sound with

the phoneme that's most likely to come next. During this process, the program assigns a probability

score to each phoneme, based on its built-in dictionary and user training.
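
To make the chain-of-phonemes idea concrete, the following is a minimal Viterbi decoding sketch in C++. It is illustrative only: the two states, the transition and emission probabilities, and the observation sequence are all invented for this example, whereas a real recognizer uses models with thousands of states learned from data.

    // Viterbi decoding over a toy hidden Markov model.
    // All probabilities below are made-up illustration values.
    #include <cstdio>
    #include <cmath>

    int main() {
        const int N = 2;                                  // hidden states ("phonemes")
        const int T = 3;                                  // observation frames
        double startP[N]    = {0.6, 0.4};                 // initial probabilities
        double transP[N][N] = {{0.7, 0.3}, {0.4, 0.6}};   // state transitions
        double emitP[N][2]  = {{0.9, 0.1}, {0.2, 0.8}};   // emitP[s][o]: state s emits symbol o
        int obs[T] = {0, 1, 1};                           // observed symbol sequence

        double v[T][N];   // v[t][s]: best log-probability of a path ending in s at time t
        int back[T][N];   // backpointers for recovering the best path

        for (int s = 0; s < N; ++s)
            v[0][s] = std::log(startP[s]) + std::log(emitP[s][obs[0]]);

        for (int t = 1; t < T; ++t)
            for (int s = 0; s < N; ++s) {
                v[t][s] = -1e300;
                back[t][s] = 0;
                for (int p = 0; p < N; ++p) {             // pick the best predecessor state
                    double cand = v[t - 1][p] + std::log(transP[p][s]);
                    if (cand > v[t][s]) { v[t][s] = cand; back[t][s] = p; }
                }
                v[t][s] += std::log(emitP[s][obs[t]]);
            }

        int path[T];                                      // trace the best path backwards
        path[T - 1] = (v[T - 1][0] > v[T - 1][1]) ? 0 : 1;
        for (int t = T - 1; t > 0; --t) path[t - 1] = back[t][path[t]];
        for (int t = 0; t < T; ++t) std::printf("t=%d state=%d\n", t, path[t]);
        return 0;
    }

Each state plays the role of a phoneme in the chain, the scores are the probability scores mentioned above, and the backpointers recover the most likely chain.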

This process is even more complicated for phrases and sentences – the system has to figure out

where each word stops and starts. The classic example is the phrase "recognize speech," which

sounds a lot like "wreck a nice beach" when you say it very quickly. The program has to analyze the

phonemes using the phrase that came before it in order to get it right.

If a program has a vocabulary of 60,000 words (common in today's programs), a sequence of three

words could be any of 216 trillion possibilities. Obviously, even the most powerful computer can't

search through all of them without some help.
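
(Indeed, 60,000^3 = 2.16 × 10^14, i.e., 216 trillion possible three-word sequences.)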

That help comes in the form of program training.

These statistical systems need lots of exemplary training data to reach their optimal performance, sometimes on the order of thousands of hours of human-transcribed speech and hundreds of megabytes of text. These training data are used to create acoustic models of words, word lists, and multi-word probability networks. There is some art in how one selects, compiles, and prepares this training data for "digestion" by the system, and in how the system models are "tuned" to a particular application. These details can make the difference between a well-performing system and a poorly performing one, even when using the same basic algorithm.

Chapter 4
Structure of the system

4.1 Introduction
The structure and operation of the system are discussed in this chapter. We have designed and built a robot with the capability to pick up and drop limited-size objects, applying force according to the object's stiffness. The robot is capable of responding to 15 distinct verbal commands. In our project we used the Google speech recognition module to understand verbal commands. We also categorized objects into six specific categories according to the amount of gripping force required to lift them. An Android application communicates with the robot through Bluetooth: the application decodes the human speech into an array of characters, which is transmitted to the robot, and the robot's microcontroller decodes the messages into executable functions.

4.2 Procedure and Functionality


Before describing the workflow and algorithms of the system, we first explain how the system works. For clarity, we also explain the fifteen verbal commands the robot recognizes and the corresponding functions it performs.

4.2.1 Procedure
First, a human speech command is picked up through an Android application that uses Google speech recognition to convert the verbal command into an array of characters. This array of characters is sent to the microcontroller on the robot using a Bluetooth module. The microcontroller receives the data and processes it into an executable function. There are 15 functions that we have predefined for the robot to execute, such as moving forward or backward, or picking up and dropping objects with only the force required. If the verbal command matches one of these predefined functions, the robot performs that function; if it does not match any of them, the robot takes no action.
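
As a hedged illustration of this receive-and-dispatch step, the following Arduino-style C++ sketch shows the shape of the firmware loop. The pin numbers, baud rate, terminator character, and exact command strings are assumptions for the sketch, not taken from the report's actual code.

    // Receive a character array over Bluetooth and dispatch it to a function.
    #include <SoftwareSerial.h>

    SoftwareSerial bt(10, 11);   // RX, TX wired to the HC-05 (assumed pins;
                                 // a hardware serial port could be used instead)
    String cmd;

    void moveForward() { /* drive the wheel motors forward */ }
    void stopRobot()   { /* stop all motors */ }

    void setup() {
      bt.begin(9600);            // assumed Data-mode baud rate of the HC-05
    }

    void loop() {
      while (bt.available()) {
        char c = bt.read();      // characters arrive one at a time from the app
        if (c == '\n') {         // assume the app ends each command with '\n'
          if      (cmd == "move forward") moveForward();
          else if (cmd == "stop")         stopRobot();
          // ...the remaining 13 commands are dispatched the same way...
          cmd = "";              // unrecognized commands are simply ignored
        } else {
          cmd += c;
        }
      }
    }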

4.2.2 Functions
The 15 functions that the robot is capable of performing are as follows.

1. Move forward
The robot will start moving forward from its current position when it receives this command. It will keep moving until it is 12 cm from any object or obstacle, which is within the distance it needs to pick up an object. The robot continuously checks the distance in front of it using the ultrasonic sensor placed at the front of its body.

2. Move backward
The robot will move backward 10 cm every time it receives this command. This allows it to then move forward and come within its pick-up zone to hold an object if needed.

3. Move left
The robot will move left by an angle of 15° every time it receives the command to “move left”.

It can then be commanded to “move forward” to move in that direction.

4. Move right
The robot will move right by an angle of 15° every time it receives the command to “move

right”, and then can be commanded to move in that direction through the “move forward”

command.

5. Stop
Whenever the robot receives the verbal command “stop”, it will come to a halt from a state of

motion.

6. Move forward X centimeters

The robot can be commanded to move forward by a certain distance using this command. Upon receiving it, the robot computes the distance it needs to move, and then moves. The robot has an ultrasonic sensor attached to the front of its body, with which it can determine whether there are any obstacles in front of it and the distance between itself and the obstacles. When the robot is in a steady state, it measures the distance in front of it. When commanded to move forward X centimeters, it subtracts X centimeters from the obstacle distance reported by the ultrasonic sensor and moves forward until that distance is reached (a code sketch of this logic follows this list).

7. Pick up the object

Upon receiving this command, the robot reaches out for an object that is 12 cm in front of it, grabs it, and picks it up. The object must be at about 12 cm because the gripper has a grabbing range of 8 cm to 15 cm in front of the robot. When this command is given, the robot lowers the arm using a DC motor for a specific time; once the arm is low enough, the gripper grabs the object with another motor, and the arm motor then moves in the opposite direction to lift the object upward so that the path between the sonar and obstacles remains open. The mechanical diagrams clarifying these operations are discussed in detail in the mechanical description chapter.

8. Drop the object

When this command is given, the robot drops any object it is holding. To drop the object, the robot lowers the arm using the arm motor, the gripper motor releases the object, and the arm motor pulls the arm back up. After that, the robot moves backward 10 centimeters; the required distance is calculated using the ultrasonic sensor reading from the object. The robot then remains steady and waits for the next command.

The following seven functions allow the robot to categorize the weight of the object it needs to pick up, in order to determine the force it must exert to hold and lift it. We have defined six categories for classifying object weight. The maximum weight the robot is capable of lifting is 1 kg, and weights up to this limit are in the "very very heavy" category; the lightest category, "very very light", covers objects in the range of 0 g to 170 g. The verbal command specifies the category of the object's weight.

9. Object is very very light
When this command is given, the robot will consider the weight of the object to be in the “very

very light” category, which is for objects of weight within the range of 0 g to 170 g. When the

robot is commanded to pick up an object afterwards, it will exert only the force required to pick

up an object of this weight, and will not exert any more force, so that the object is not crushed or

deformed.

10. Object is very light


This command will let the robot know that the object it is to pick up is of a weight within the

range of 171 g to 340 g. It will then exert the necessary force to pick up an object of this weight

when commanded to do so.

11. Object is light


This command will let the robot know that the weight of the object it is supposed to pick up is

within the range of 341 g to 510 g.

12. Object is heavy


This command is for letting the robot set the object category for a weight within 511 g to 680 g,

so that it exerts a force appropriate to pick up an object within this range of weight when it is

commanded to do so, and does not exert a force any more or less than that.

13. Object is very heavy


This command is for the robot to set the object category for a weight between 681 g to 850 g.

14. Object is very very heavy


When the robot receives this command, it sets the object category for a weight between 851 g and 1 kg.

15. Object is X kg
This command tells the robot the weight of the object, so that it can decide which category of

weight the object belongs to, so that it exerts the appropriate force when it is commanded to grab

the object and pick it up.
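
To make the "move forward X centimeters" logic from function 6 concrete, here is a minimal sketch under stated assumptions: the helpers are stubs standing in for the real motor and sonar code, and the 12 cm safety limit comes from function 1.

    // Sketch of "move forward X cm": measure the obstacle distance at rest,
    // then drive until the front distance has shrunk by X.
    float readDistanceCm() { return 100.0f; }  // stub: HC-SR04 reading (see Chapter 7)
    void driveForward()    {}                  // stub: run the wheel motors forward
    void stopRobot()       {}                  // stub: stop the wheel motors

    void moveForwardX(float x) {
      float target = readDistanceCm() - x;     // front distance after moving X cm
      if (target < 12.0f) target = 12.0f;      // never come closer than the 12 cm limit
      driveForward();
      while (readDistanceCm() > target) {
        delay(10);                             // keep re-checking the front distance
      }
      stopRobot();
    }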

4.3 Workflow and Algorithms
This section describes the sequence of events through which a verbal command is processed and a corresponding function is executed by the robot.

4.3.1 Outline of the workflow


The system consists of two parts: an Android application and a mobile robot. The Android application uses the Google speech recognition module to convert the human speech command into an array of characters and sends the processed data to the microcontroller on the robot via Bluetooth. The microcontroller processes the data into an executable function, and the robot then executes the command. This process is shown in the flow diagram in Fig. 4.1.

[Fig. 4.1. Workflow diagram of the Android part of the robot: Human speech → Android application processes the speech using the Google speech recognition module → Google speech recognition converts the speech into an array of characters → Android app sends the character array to the microcontroller via Bluetooth → Microcontroller.]

4.3.2 Explanation of the algorithm for command execution

Fig. 4.2. Workflow diagram of the algorithm for command execution in the microcontroller

Fig. 4.2 explains the algorithm for the robot's execution of verbal commands. After the microcontroller accepts the character array, it checks whether it is an acceptable command string. If not, it checks whether all previous commands have finished. If there are no commands remaining to be executed, the robot is stopped and all commands are marked as finished; if any commands remain, the running commands continue to execute.

If it is an acceptable command, the program checks whether the command is "stop". If so, the robot's motion is stopped and all commands are marked as finished. The system then goes back to checking for a new array of characters.

If the command being processed is not "stop" and belongs to one of the other 14 predefined commands, it is selected and executed. After this command completes, the robot is stopped and all commands are marked as finished. The program then starts checking for another array of characters.

While a running command is being completed, if the command is either "move forward" or "move forward X cm", a check is made for obstacles within 12 cm of the robot's front. If an obstacle is detected, the robot is stopped and all commands are marked as finished; the program then starts checking for another array of characters.

If no obstacle is detected, the robot keeps moving forward, either indefinitely or until it has covered X cm, depending on which command was given, and the program again goes back to checking for another array of characters (i.e., it waits for a new command).

Whenever the "pick up object" command is given, the robot moves forward 5 cm with its gripper open, so that the object is situated within the robot's pickup zone. The robot then picks up the object according to its weight category. By default, the object category is set to "very very light", which covers objects weighing 0 g to 170 g. There are six specific commands for object categorization. These six commands can be given during execution of the "move forward" and "move forward X cm" functions, or after the "stop" function has executed. "Object is X kg" sets the object category according to the specified weight X, which can be any value from 1 g to 1000 g.

A Force Sensitive Resistor (FSR) is mounted on the gripper. Whenever the robot is required to pick up an object according to its weight category, the FSR measures the force applied while gripping. We have set specific FSR values for the different weight categories; since a microcontroller measures the force, the categories are defined in terms of analog input values. The robot continues to tighten its grip until it reads the force value set for the selected category. For example, if we give the command "Object is heavy" and then "Pick up the object", the robot squeezes the gripper until the analog input value read by the microcontroller from the FSR lies between 480 and 500. More explanation of the FSR and its connection to the microcontroller is given in the next section, Equipment and Schematic Diagrams.

TABLE 4.1. OBJECT CATEGORIZATION TABLE A

Command                      Object is very very light   Object is very light   Object is light
Weight category (gram)       0-170                       171-340                341-510
FSR value (analog input)     950-970                     780-820                600-620

TABLE 4.2. OBJECT CATEGORIZATION TABLE B

Command                      Object is heavy   Object is very heavy   Object is very very heavy
Weight category (gram)       511-680           681-850                851-1000
FSR value (analog input)     480-500           320-340                180-200
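
A minimal sketch of the gripping loop implied by Tables 4.1 and 4.2 follows (an illustration, not the report's firmware). In the tables the analog reading falls as the grip force rises, so the gripper keeps squeezing until the reading drops into the window for the selected weight class; the pin choice and motor helpers are assumptions.

    const int FSR_PIN = A0;                    // assumed analog pin for the FSR divider

    void startClosingGripper() {}              // stub: run the gripper motor to close
    void stopGripperMotor()    {}              // stub: stop the gripper motor

    void gripToWindow(int lo, int hi) {
      startClosingGripper();
      int r = analogRead(FSR_PIN);
      while (r > hi) {                         // reading above window: force still too low
        delay(5);
        r = analogRead(FSR_PIN);
      }
      stopGripperMotor();                      // stop once inside (or just past) lo..hi
      // If r < lo the grip overshot; a fuller version would back the motor off.
      (void)lo;
    }

    // Example: after "Object is heavy", the firmware would call gripToWindow(480, 500).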

4.4 Equipment and Schematic Diagrams

The following electrical equipment was used in this project:

 Microcontroller (Arduino Mega 2560)

 Sonar module (HC-SR04)

 Bluetooth module (HC-05)

 DC motor N20 (200 rpm) for the wheels

 DC motor N20 (100 rpm) for the gripper

 Force Sensitive Resistor (FSR 400)

 Motor driver L293D

The schematic diagram of the microcontroller part is divided into two blocks (as shown in Fig. 4.3), and schematic diagrams of the two individual blocks are given on the following page.

Fig. 4.3. Block diagram of robot circuit

Block A shows the connections between the microcontroller and the Force Sensitive Resistor (FSR), Bluetooth module, and sonar module (Fig. 4.4). Block B shows the connections between the microcontroller and the two motor drivers that control the wheel motors and gripper motors (Fig. 4.5).

Fig. 4.4. Schematic diagram of Block A

Fig. 4.5. Schematic diagram of Block B

The ultrasonic sonar module (HC-SR04) is used to measure the frontal distance of any object from the robot. It helps the robot navigate and indicates whether any object is in the robot's vicinity; it also confirms whether the object is within the robot's pickup zone. The Bluetooth module (HC-05) establishes the connection between the Android application and the microcontroller to transmit data from the app as an array of characters. The FSR 400 is used to measure the amount of force applied while grabbing an object in order to pick it up. Two L293D motor drivers control the four motors of the gripper and wheels.

4.5 Summary
In this chapter, we have described how the system works. The sequence of events through which a verbal command is processed, sent to the microcontroller, and executed has been described in detail. We have also given diagrams of the workflow, the command execution algorithm, and the robot circuit, along with the equipment used. Detailed descriptions of the equipment are given in the following chapters.

Chapter 7
Modules used in this system

7.1 Introduction
The different modules used in our system and their functions are described in this chapter.

The following modules have been used in the construction of the robot:

 Sonar module (HC-SR04)

 Bluetooth module (HC-05)

 Force Sensitive Resistor (FSR 400)

 Motor driver L293D

7.2 Ultrasonic Ranging Module HC-SR04


The HC-SR04 ultrasonic ranging module provides 2 cm to 400 cm of non-contact distance measurement, with ranging accuracy up to 3 mm. The module includes an ultrasonic transmitter, a receiver, and a control circuit.

The basic principle of operation is:

(1) Apply a high-level signal of at least 10 µs to the trigger (IO) pin.

(2) The module automatically sends eight 40 kHz pulses and detects whether an echo signal returns.

(3) If an echo returns, the echo pin goes high; the duration of the high level is the time from sending the ultrasonic burst to receiving its return.

Test distance = (high-level time × velocity of sound (340 m/s)) / 2

There are only four pins on the HC-SR04 that you need to worry about: VCC (Power), Trig (Trigger), Echo (Receive), and GND (Ground).

Wire connections are as follows:

 5V Supply

 Trigger Pulse Input

 Echo Pulse Output

 0V Ground
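
A minimal Arduino reading sketch implementing the formula above is shown below (the pin choices are assumptions):

    const int TRIG_PIN = 9;                        // assumed wiring
    const int ECHO_PIN = 8;

    void setup() {
      Serial.begin(9600);
      pinMode(TRIG_PIN, OUTPUT);
      pinMode(ECHO_PIN, INPUT);
    }

    void loop() {
      digitalWrite(TRIG_PIN, LOW);
      delayMicroseconds(2);
      digitalWrite(TRIG_PIN, HIGH);                // >= 10 us trigger pulse, per step (1)
      delayMicroseconds(10);
      digitalWrite(TRIG_PIN, LOW);

      unsigned long us = pulseIn(ECHO_PIN, HIGH);  // echo high time in microseconds
      float cm = us * 0.034f / 2.0f;               // 340 m/s = 0.034 cm/us; halve for round trip
      Serial.println(cm);                          // distance to the obstacle in cm
      delay(100);
    }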
Fig. 7.1. Sonar sensor module HC-SR04

7.3 Bluetooth Module (HC-05)


The HC-05 is a Class 2 Bluetooth module designed for transparent wireless serial communication. It is pre-configured as a slave Bluetooth device. Once it is paired with a master Bluetooth device such as a PC, smartphone, or tablet, its operation becomes transparent to the user: no user code specific to the Bluetooth module is needed in the user's microcontroller program.

The HC-05 supports two work modes, Command mode and Data mode, which can be switched using the onboard push button. The HC-05 is put into Command mode if the push button is activated. In Command mode, the user can change system parameters (e.g., PIN code, baud rate) using the host controller itself or a PC running terminal software through a serial-to-TTL converter. Any changes made to system parameters are retained even after power is removed. Power-cycling the HC-05 sets it back to Data mode. Transparent UART data transfer with a connected remote device occurs only in Data mode.

The HC-05 can be re-configured by the user to work as a master Bluetooth device using a set of AT commands. Once configured as master, it can automatically pair with an HC-05 in its default slave configuration, or with an HC-06 module, allowing point-to-point serial communication.
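
As an illustration of Command mode (a typical exchange using the module's standard AT command set; exact commands can vary slightly between firmware versions):

    AT                   (module replies OK)
    AT+NAME=RobotBT      (set the advertised device name; "RobotBT" is just an example)
    AT+PSWD=1234         (set the pairing passkey)
    AT+UART=9600,0,0     (set the Data-mode baud rate to 9600, 1 stop bit, no parity)
    AT+ROLE=1            (1 = master, 0 = slave)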

The HC-05 works with a supply voltage of 3.6 VDC to 6 VDC; however, the logic level of the RXD pin is 3.3 V and is not 5 V tolerant. It can be damaged if connected directly to a 5 V device (e.g., Arduino Uno or Mega), so a logic level converter is recommended to protect the HC-05. Power to the HC-05 is cut off if the "EN" pin is pulled to logic 0.

Features:

 Bluetooth v2.0+EDR

 2.4GHz ISM band frequency

 Supported baud rate: 9600, 19200, 38400 (default), 57600, 115200, 230400, and 460800.

 Speed: Asynchronous: 2.1Mbps (Max) / 160 kbps, Synchronous: 1Mbps/1Mbps

 Power supply: 3.6V to 6V DC

 Passkey: 1234

Fig. 7.2. Bluetooth module HC-05

7.4 Force Sensitive Resistor (FSR 400)
The model 400 FSR is a single-zone Force Sensing Resistor optimized for use in human-touch control of electronic devices, such as automotive electronics, medical systems, and industrial and robotics applications. FSRs are two-wire devices. They are robust polymer thick film (PTF) sensors that exhibit a decrease in resistance with an increase in the force applied to the surface of the sensor. The FSR 400 has a 5.1 mm diameter active area and is available in four connection options.
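
In a typical hookup (an assumption; the sensor can be wired several ways), the FSR forms a voltage divider with a fixed resistor Rm, and the microcontroller reads the voltage across the FSR:

    Vout = Vcc × Rfsr / (Rm + Rfsr)

As force increases, Rfsr falls, so Vout and the analog reading fall with it. This is consistent with Tables 4.1 and 4.2, where heavier grip-force categories correspond to lower analog values.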

Features and Benefits:

 Actuation force as low as 0.1N and sensitivity range to 20N

 Easily customizable to a wide range of sizes

 Cost-effective

 Ultra-thin

 Robust: Up to 10 million actuations

 Simple and easy to integrate

Applications:

 Detect & Qualify Press

Sense whether a touch is accidental or intended, by reading force

 Use force for UI feedback

Detect more or less user force, to make a more intuitive interface

 Enhance tool safety

Differentiate a grip from a touch, as a safety lock

 Find centroid of force

Use multiple sensors to determine centroid of force

 Detect presence, position or motion

of a person or patient in a bed, chair or medical device

 Detect liquid blockage

Detect tube or pump occlusion or blockage by measuring back pressure

 Many other force change detection applications

Fig. 7.3. Force sensitive resistor FSR-400

7.5 Motor Driver L293D


The L293D is a dual H-bridge motor driver integrated circuit (IC). Motor drivers act as current amplifiers: they take a low-current control signal and provide a higher-current signal, which is used to drive the motors.

The L293D contains two built-in H-bridge driver circuits. In its common mode of operation, two DC motors can be driven simultaneously, in both forward and reverse directions. The operation of the two motors is controlled by the input logic at pins 2 & 7 and 10 & 15. Input logic 00 or 11 stops the corresponding motor; logic 01 or 10 rotates it in the clockwise or anticlockwise direction, respectively.

Enable pins 1 and 9 (corresponding to the two motors) must be high for the motors to operate. When an enable input is high, the associated driver is enabled, and its outputs become active and work in phase with the inputs. Similarly, when the enable input is low, that driver is disabled and its outputs are off and in the high-impedance state.
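
The logic above translates directly into microcontroller code. The following Arduino-style sketch drives one motor through the L293D; the Arduino pin numbers are assumptions, while the 00/01/10 input patterns are those described above.

    const int IN1 = 4;   // to L293D input 1 (IC pin 2)
    const int IN2 = 5;   // to L293D input 2 (IC pin 7)
    const int EN1 = 6;   // to L293D enable 1 (IC pin 1); PWM-capable pin

    void setup() {
      pinMode(IN1, OUTPUT);
      pinMode(IN2, OUTPUT);
      pinMode(EN1, OUTPUT);
    }

    void loop() {
      analogWrite(EN1, 200);     // enable the driver (PWM duty also sets speed)
      digitalWrite(IN1, LOW);    // input logic 01: rotate clockwise
      digitalWrite(IN2, HIGH);
      delay(2000);
      digitalWrite(IN1, HIGH);   // input logic 10: rotate anticlockwise
      digitalWrite(IN2, LOW);
      delay(2000);
      digitalWrite(IN1, LOW);    // input logic 00: stop
      digitalWrite(IN2, LOW);
      delay(1000);
    }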

Fig. 7.4. Pin diagram of the L293D

7.5.1 Dual H-bridge logic


An H-bridge is an electronic circuit that enables a voltage to be applied across a load in either

direction. These circuits are often used in robotics and other applications to allow DC motors to run

forwards and backwards.

Fig. 7.5. H-bridge circuit

7.5.2 Table of operation

Fig. 7.6. Operation of H-bridge in L293D

7.6 Summary
In this chapter, we have described, as clearly and briefly as possible, the different modules used in our system: the sonar module, the Bluetooth module, the force sensitive resistor, and the L293D motor driver. The following chapter describes the mechanical construction of the robot.


Chapter 8
Mechanical Description

8.1 Introduction

The mechanical construction of the robot is described in detail in this chapter. The mechanical part of the robot was designed and made from scratch entirely by us. We designed a unique gripper that is capable of gripping, lifting, and dropping objects. The 1st model of the robot was built using PVC sheets; we used 5 mm and 3 mm thick PVC boards for the construction of the body. The final model was built using wooden boards, which were quite flexible, so the robot was not fully rigid. The mechanical measurements and explanations of the constructions are discussed below.

8.2 Mechanical measurements and explanations of 1st model

The robot has two bases, namely upper base and lower base, situated 6.3 cm apart from each other.

The two bases are connected via base joints. The electrical components are placed on the lower base.

The gripper needs two n20 (100 rpm screw-shaft) motors. One motor (Gripper motor 1 in Fig. 8.1) grips the object, and the other motor (Gripper motor 2 in Fig. 8.1) lifts it. The length of the screw shaft of the 100 rpm motor is around 3.3 cm. When the screw shaft of Gripper motor 2 rotates, the screw moves forward or backward, which in turn moves Joint 1. When Joint 1 moves forward or backward, Lift Hand 1 moves up or down accordingly, which rotates Lift Hand 2 about Pivot 2, lifting the object up or down as required. Lift Hand 1 and Lift Hand 2 are connected by Pivot 1.

Fig. 8.1. Side view of the robot (Diagram drawn using SketchUp)

Lift Hand 2 is connected to the upper base of the robot at the Pivot 2 joint. As there is a 6.3 cm gap

between the two bases, there is sufficient space for the electrical equipment to be placed between the

upper base and the lower base. One castor ball is placed at the front to allow free movement of the

robot.

The sonar module placed on the lower base determines the distance of any object from the robot. Under normal conditions, if an object is around 5 cm from the sonar, that object is selected as an eligible candidate for the lifting operation; a sketch of this check is shown below. There are also two 200 rpm n20 motors for wheel rotation. Fig. 8.2 shows the upper view of the robot. The width of the robot is 18 cm and the length is 22 cm without the gripper; with the gripper, the length is around 35 cm.
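As a brief illustration of the candidate-selection check mentioned above (assuming an HC-SR04-style trigger/echo sonar module and our own pin choices, which may differ from the actual firmware):

// Measure distance with a trigger/echo sonar and flag objects within ~5 cm.
const int TRIG_PIN = 8;   // assumed trigger pin
const int ECHO_PIN = 9;   // assumed echo pin

float readDistanceCm() {
  digitalWrite(TRIG_PIN, LOW);
  delayMicroseconds(2);
  digitalWrite(TRIG_PIN, HIGH);   // 10 us trigger pulse
  delayMicroseconds(10);
  digitalWrite(TRIG_PIN, LOW);
  unsigned long us = pulseIn(ECHO_PIN, HIGH, 30000UL); // echo time, 30 ms timeout
  return us / 58.0;               // about 58 us per cm for the round trip
}

void setup() {
  pinMode(TRIG_PIN, OUTPUT);
  pinMode(ECHO_PIN, INPUT);
  Serial.begin(9600);
}

void loop() {
  float d = readDistanceCm();
  if (d > 0 && d <= 5.0) {
    Serial.println("Object is an eligible candidate for lifting");
  }
  delay(200);
}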

Fig. 8.2. Upper view of the robot (Diagram drawn using SketchUp)

While researching the mechanical part of our robot, we came across many existing gripper constructions. One such construction has the capability to grasp irregular objects [4]; however, the construction of such a gripper requires more space and mechanical work. Another paper proposed a gripper actuated by pneumatic muscles [5], but it had limitations in grasping irregular objects. In Fig. 8.3 we propose a very simple gripper design that is capable of holding limited-sized irregular objects. In Fig. 8.3, the motor is situated in the gripper base.

Whenever the screw shaft of the motor rotates, the screw moves forward or backward, which in turn makes Hand 3 move up or down. Hand 3 is connected to Hand 1 at Pivot 2. Whenever Hand 3 moves, it makes Hand 1 rotate around Pivot 1, which makes Hand 2 move accordingly.

Fig. 8.3 The gripper construction (Diagram drawn using SketchUp)

Hand 2 has a foam grip attached to it, which provides more friction and allows the gripper to grab objects better. Hand 2 is flexible and rotatable around Pivot 3, which helps in gripping irregular objects. As the construction is symmetrical, the gripping action works even on irregular objects.

Fig. 8.4. Model 1 trying to pick up an object

8.3 Overview of 2nd model and 3rd model


The second model was built from wooden boards and used a mechanical tank-track system. It had a total weight of 11 kg; it was mechanically heavy, but the tank tracks allowed it to maneuver in sand. The mechanical arm was built from wood. The second model had seven degrees of freedom in total, with one arm that itself had five degrees of freedom. The 3rd model had twelve degrees of freedom, each arm itself having five degrees of freedom. Other than the arms, the robot had two strong wheel motors for locomotion. The 3rd model was around 18 kg in total weight.

Fig. 8.5. The robotic arm (2nd version), upper view and side view

Fig. 8.6. The 2nd and 3rd models of the robot


8.4 Summary
In this chapter, the mechanical construction of the robot has been described in detail. Detailed

explanation of how each of the mechanical components works has also been given here.


Chapter 11
Results and Discussion

11.1 Introduction
The results and findings of our project are discussed in this chapter. We have solved the equations derived from the inverse square law of sound attenuation to find the location of the sound source in a 3D coordinate system using three microphones. Also, we have compared the speech recognition performance of our self-developed software with that of the Google Speech Recognition API.

11.2 Results and findings of sound source localization


We computed the autocorrelation of the recorded signal to extract the fundamental period from a single microphone, and did the same for each of the three microphones. We then applied the Fourier transform to obtain a clean frequency estimate; the three microphones gave closely matching frequency readings, with only about 3% error.
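As a minimal sketch of the autocorrelation step only (the sample rate, buffer size and test tone are our own assumptions; the FFT stage is omitted):

// Estimate the fundamental frequency of a sampled signal from the
// highest autocorrelation peak after the first zero crossing.
#include <cstdio>
#include <cmath>
#include <vector>

const double PI = 3.14159265358979323846;

double estimateFrequency(const std::vector<double>& x, double fs) {
  int n = (int)x.size();
  std::vector<double> r(n / 2, 0.0);
  for (int lag = 0; lag < n / 2; ++lag)        // autocorrelation r[lag]
    for (int i = 0; i + lag < n; ++i)
      r[lag] += x[i] * x[i + lag];
  int lag = 1;
  while (lag < n / 2 && r[lag] > 0.0) ++lag;   // skip the lobe around lag 0
  int bestLag = 0;
  double bestVal = 0.0;
  for (; lag < n / 2; ++lag)                   // peak lag = fundamental period
    if (r[lag] > bestVal) { bestVal = r[lag]; bestLag = lag; }
  return bestLag > 0 ? fs / bestLag : 0.0;
}

int main() {
  const double fs = 8000.0;   // assumed sample rate (Hz)
  const double f0 = 130.0;    // test tone, matching the first row of Table 11.1 below
  std::vector<double> x(1024);
  for (int i = 0; i < (int)x.size(); ++i)
    x[i] = std::sin(2.0 * PI * f0 * i / fs);
  std::printf("Estimated frequency: %.1f Hz\n", estimateFrequency(x, fs));
  return 0;
}

On the synthetic 130 Hz tone, this prints an estimate near 130 Hz, comparable to the microphone readings reported in Table 11.1.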

Fig. 11.1. Raw sound data analysis and amplitude detection

In Fig. 11.2, the topmost trace is the data from microphone 1, the middle one is the data from microphone 2, and the bottom one is the data from microphone 3.

Fig. 11.2 Data from 3 microphones

At first we calculated the frequency of each microphone signal through autocorrelation and the fast Fourier transform. We then took the average of the three frequencies to obtain the resultant frequency.

TABLE - 11.1 FREQUENCY ANALYSIS TABLE

Real frequency vs. frequency measured at each microphone (Hz)

Real Frequency (Hz) | Microphone 1 | Microphone 2 | Microphone 3
130                 | 127.8        | 129.2        | 133.5
50                  | 51           | 52           | 50
75                  | 72           | 70           | 67
180                 | 183          | 184          | 182
250                 | 251          | 252          | 255
190                 | 191          | 190          | 193
110                 | 111          | 115          | 119

Then we calculated the coordinates of the source from the three microphones' data, obtaining the following results:

TABLE - 11.2 COORDINATE ANALYSIS TABLE

Real (X,Y,Z) coordinate of source vs. measured coordinates (cm)

Real (X,Y,Z) | Measured X | Measured Y | Measured Z
200,110,30   | 205        | 105        | 32
40,50,30     | 40         | 52         | 32
111,212,33   | 107        | 205        | 32
300,400,55   | 290        | 398        | 49
400,550,60   | 389        | 488        | 58
600,300,56   | 592        | 304        | 57
500,550,60   | 589        | 488        | 58
600,520,60   | 689        | 510        | 59
700,550,60   | 789        | 488        | 55
770,550,60   | 772        | 488        | 58

From the above findings we can conclude that accuracy decreases with greater distance; overall, the accuracy is about 94%.
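For reference, the localization model can be sketched as follows; treating the source as a point radiator of calibrated power P is our modelling assumption, not a detail stated above. With microphone i at known position (x_i, y_i, z_i) and the source at the unknown (x, y, z):

I_i = \frac{P}{4\pi r_i^2}, \qquad r_i = \sqrt{(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2}, \qquad i = 1, 2, 3

Taking ratios of measured intensities cancels the unknown power:

\frac{I_i}{I_j} = \frac{r_j^2}{r_i^2} \quad\Rightarrow\quad r_j = r_i \sqrt{I_i / I_j}

Each microphone pair therefore constrains the source position; with a calibrated P (or an additional microphone), the system closes and can be solved numerically for (x, y, z).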

11.3 Results and findings of speech recognition


We have created an Android application using the Google speech recognition module. There are many other existing speech recognition systems, but Google's speech response was comparatively better, which is why we used it. Some of the studies we came across involved robot interaction for speech development [7], while others involved robotic arm control using speech processing [10]. There was also research on controlling operating systems using speech recognition [12]; examples included opening and closing the CD drive and navigating through folders in an operating system.

Out of all these options we chose Google speech recognition because evaluations of Google's system were better compared to the other speech recognition systems [14]. We had 10 people give 300 commands from our command library. Out of these, 267 commands were recognized accurately, giving an accuracy rate for the fixed command set of around 89%.

The robot's response to objects of different weights was not the same: because of the differing heaviness, the time required to lift an object with the same motor varied from object to object, and the current rating differed between categories. Taking 10 objects from each weight category, we measured the time required to lift them and averaged the results to obtain the time required by the motor for each category.

TABLE - 11.3 TIME REQUIRED TO LIFT DIFFERENT OBJECT CATEGORIES

Object Category             | Very very light | Very light | Light
Weight Category (gram)      | 0-170           | 171-340    | 341-510
Time required to lift (sec) | 3               | 6          | 8

TABLE - 11.4 TIME REQUIRED TO LIFT DIFFERENT OBJECT CATEGORIES (CONTINUED)

Object Category             | Heavy   | Very heavy | Very very heavy
Weight Category (gram)      | 511-680 | 681-850    | 851-1000
Time required to lift (sec) | 9       | 11         | 13
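To illustrate how these measured times could be used in firmware (the thresholds follow Tables 11.3 and 11.4; liftTimeSeconds() and runLiftMotor() are hypothetical names for this sketch, not the actual implementation):

// Map a measured weight in grams to the motor-on time from Tables 11.3/11.4.
int liftTimeSeconds(int grams) {
  if (grams <= 170)  return 3;   // very very light
  if (grams <= 340)  return 6;   // very light
  if (grams <= 510)  return 8;   // light
  if (grams <= 680)  return 9;   // heavy
  if (grams <= 850)  return 11;  // very heavy
  if (grams <= 1000) return 13;  // very very heavy
  return -1;                     // beyond the 1 kg limit: refuse to lift
}

void liftObject(int grams) {
  int t = liftTimeSeconds(grams);
  if (t < 0) return;             // object too heavy for the gripper
  // runLiftMotor(true);         // hypothetical helper: start Gripper motor 2
  delay((unsigned long)t * 1000UL);
  // runLiftMotor(false);        // hypothetical helper: stop the motor
}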

With a few exceptions, the robot was able to lift almost all objects. Our limitation on object size is 15 × 15 × 20 cm, i.e. the object must fit within a square footprint of area less than 225 cm², and the height of the object must be less than 20 cm, to qualify as a candidate for the lifting operation. In Fig. 11.3 we can see the robot lifting a "very heavy" category object. In Fig. 11.4 the robot is lifting a "very very light" category object; it was easily able to transport it from one place to another through verbal commands. In Fig. 11.5 the robot is lifting an object in the "very very heavy" weight category.

Fig. 11.3. Robot lifting “very heavy” category object

Fig. 11.4. Robot trying to lift “very very light” category object

Fig. 11.5. Robot lifting “very very heavy” category object

11.3.1 Limitations
The robot has some limitations while lifting objects; it cannot lift all objects. As discussed before, the maximum weight of object this robot can lift is 1 kg. Also, the object to be lifted must fit within 15 × 15 × 20 cm, i.e. it must fit within a square footprint of area less than 225 cm², and its height must be less than 20 cm, to qualify as a candidate for the lifting operation.

The robot was made using PVC sheets. As PVC sheets are not strong enough, the sheets sometimes bend while lifting heavy objects. We intend to build the robot using stronger materials and overcome the bending issue.

The robot uses DC motors and measures distance using sonar. While in motion, the robot does not always stop at the exact instant it needs to, due to inertia. However, at this point that is not a major problem.

11.4 Summary
In this chapter, we have described and discussed the results of our project, which involved locating the sound source and constructing a speech-responsive robot with object categorization.


Chapter 12
Conclusion

In this project, we have focused on two things: sound source localization, and building a speech-responsive mobile robot with object categorization.

Firstly, we were able to locate the sound source in a 3D coordinate system. There is scope for extending this research in the future; for example, an autonomous robot that can detect a speaking person relative to the robot's position. Even in darkness, such a system can locate a loud sound through sound analysis alone.

We have also successfully constructed the robot, and designed and built a unique gripper for the robot's hands. The robot has the capability to respond to speech and transport limited-sized objects using Google speech recognition. The accuracy rate for successful response to speech was 89%, and the accuracy rate for successful completion of a task was 94%. Hardly any accidents occurred, as the robot navigated using sonar: whenever an object was in very close vicinity, the robot successfully detected the obstacle and stopped.

We hope to build a stronger, bigger and more robust robot in the future that can lift heavier objects. This robot can be very useful for the disabled, as it will provide them with a means of doing daily work with ease, simply by giving verbal commands for the transportation of objects according to their needs. A robot that can help us do our work without using our hands would also be industrially profitable. It could even aid people in difficult situations, such as rescue operations in disaster zones, or help doctors in medical procedures as a third hand.

Thus, development in this field could open up boundless possibilities and new applications that can

be very useful and have a great impact on people’s lives.


Bibliography

[1] Norberto Pires, J. (2005). Robot-by-voice: Experiments on commanding an industrial robot

using the human voice. Industrial Robot: An International Journal, 32(6), 505-511.

[2] Jain, R., & Saxena, S. K. (2011). Voice Automated Mobile Robot. International Journal of Computer Applications, 16(2), 32-35.

[3] Abed, A. A. Design of Voice Controlled Smart Wheelchair.

Appendices

Appendix A
Codes Related to Arduino

This code is written in C++ (for the Arduino).

#include <SoftwareSerial.h>   // software serial, used for the Bluetooth link
#include <Servo.h>            // servo library for the arm joints

Servo myservo;                // servo object controlling one joint
#define RxD 53                // receive pin (assumed: data in from the Bluetooth module)

