AI 417 CLASS X
Introduction
In the present time, a thorough knowledge
of language with communication skills
is very important in any occupation or
business. As a student, you may study
any language, but it is important that you
are able to read, write, speak and listen
well in order to communicate properly.
Speaking more than one language can
help you to communicate well with people
around the world. Learning English can
help you to communicate with people who
understand English besides the mother
tongue i.e., the language one has been
exposed to since birth.
[Figure: The communication process. Communication starts with the
sender (the messenger), who encodes information ("what I mean") into
a message. A channel (speaking, writing, graphic, video, etc.) is used
to transfer the message to the receiver (the recipient), who forms
"what I understand". Sender and receiver need at least some code in
common. The receiver's reply to the sender completes the cycle.]
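The cycle of sender, message, channel, receiver and reply can also be tried out as a small program. The sketch below is only an illustration: the function names, the shared code table and the reply sentences are made up for this example and are not part of the lesson. It shows that a message is understood only when the sender and the receiver have some code in common.

```python
# Hypothetical model of the communication cycle; all names are illustrative.

# A shared "code": both people must know at least some of it.
SHARED_CODE = {"hello": "H1", "thanks": "T1"}


def encode(idea, code):
    """Sender turns an idea into a message using the code."""
    return code.get(idea)  # None if the sender has no symbol for the idea


def decode(message, code):
    """Receiver turns the message back into an idea, if the code is known."""
    for idea, symbol in code.items():
        if symbol == message:
            return idea
    return None  # the receiver does not understand the message


def communicate(idea, sender_code, receiver_code):
    """Send one idea and return the receiver's reply (the feedback)."""
    message = encode(idea, sender_code)          # sender encodes the idea
    understood = decode(message, receiver_code)  # receiver decodes the message
    if understood is None:
        return "Sorry, I did not understand."
    return "I understood: " + understood


print(communicate("hello", SHARED_CODE, SHARED_CODE))
print(communicate("hello", SHARED_CODE, {}))
```

In the second call the receiver has no code in common with the sender, so the reply tells the sender that the message was not understood.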
Communication Skills 3
Practical Exercises
The teacher will facilitate these activities by showing you the
e-learning lesson at http://www.psscive.ac.in/stud_text_book.html.
This will include videos and e-content for the above topics
as well as detailed instructions for some activities below.
Initial Thinking Activity
After watching the initial video in the e-learning lesson for
this topic, write the answer to the following question: Why is it
important to communicate effectively?
Activity 1
Role Play on Communication
Procedure
• Form groups with four students in each group.
• The situation is that a student is a Sales Executive at a
toy store and he or she is supposed to communicate to
customers about the various types of toys available with
the store for different age groups.
• The other students will approach the Sales Executive one by
one and ask different types of questions related to toys.
• Develop a script for the role play and act on the same.
• Discuss what you all learned from this activity.
B. Subjective question
Practical Exercise
The teacher will facilitate these activities by showing you the
e-learning lesson at http://www.psscive.ac.in/stud_text_book.html.
This will include videos and e-content for the above topics
as well as detailed instructions for some activities below.
Activity 1
Group-Practice: Role Play of a Telephonic Conversation
Material required
Notebook, pen
Procedure
• Form groups with three students in each group.
• Write a phone conversation based on a given scenario of a
student calling a university academic coordinator to know
about study courses and admission procedure.
• One student acts as caller and the other as receiver.
• Read out the conversation by enacting the roles.
• The third student gives feedback based on the 7Cs of
communication (clear, concise, concrete, correct, coherent,
complete and courteous).
Activity 2
Group-Practice on Public Speaking
Material required
Notebook, pen
Procedure
• Form groups with three students in each group.
• Within the group, choose a topic for a short speech.
For example, Importance of Punctuality, Healthy Food
Habits, etc.
• Each person should make a speech to the others in the
group, who then give feedback based on whether the
person was able to communicate properly.
• One student from the group volunteers to give the same
speech in front of the class.
B. Subjective question
Visual Communication
Visual communication proves to be effective since it
involves interchanging messages only through images
or pictures and therefore, you do not need to know any
particular language for understanding it. It is simple
and remains consistent across different places. Some
common types of visual communication are shown in
Table 1.6.
Table 1.6: Examples of Visual Communication
Visual Communication: Exchanging Information through Images
[Images: ‘Under construction’ sign and ‘No pets allowed’ sign]
Practical Exercises
The teacher will facilitate these activities by showing you the
e-learning lesson at http://www.psscive.ac.in/stud_text_book.html.
This will include videos and e-content for the above topics
as well as detailed instructions for some activities below.
Initial Thinking Activity
After watching the initial video in the e-learning lesson for this
topic, write down how Rohit could understand that something was
wrong with Amar. Can you understand how your friends are
feeling even when they do not tell you anything?
Activity 1
Group-Practice: Role-play on Non-verbal Communication
Material required
Notebook, pen
Procedure
• Form groups with three students in each group.
• Prepare the script for the role play, based on the given
scenario. For example, a hearing-impaired salesperson is
attending to a female customer at an apparel store.
• Act it out in front of your group.
• One group volunteers to act before your whole class.
Discuss how students used non-verbal communication.
Was this communication effective?
Activity 3
Individual-Practice: Comparing Methods of Communication
Material required
Notebook, pen
Procedure
• Discuss the three methods of communication (Verbal,
Non-verbal and Visual).
• Ask each student to write a list of the advantages and
disadvantages of each method.
• Practice: In all your conversations at home and school,
pay attention to the non-verbal signs others are using.
Practice using the non-verbal methods you learnt here in
the right manner.
B. Subjective question
Feedback
Feedback, if shared properly, can help reinforce existing
strengths and can increase the recipient’s abilities to
Importance of Feedback
Feedback is the final component and one of the most
important factors in the process of communication. It
is defined as the response given by the receiver to
the sender. Let us look at certain reasons why feedback
is important.
• It validates effective listening: The person
providing the feedback knows they have been
understood (or received) and that their feedback
provides some value.
• It motivates: Feedback can motivate people to
build better work relationships and continue the
good work that is being appreciated.
• It is always there: Every time we speak to
a person, we communicate feedback, so it is
impossible not to provide any.
• It boosts learning: Feedback is important to
remain focussed on goals, plan better and develop
improved products and services.
• It improves performance: Feedback can
help to form better decisions to improve and
increase performance.
Activity 1
Role Play on Providing Feedback
Material required
Notebook, pen
Procedure
• Form groups with five students in each group.
• Two volunteers in the group should act out a role play
of hotel staff. For example, Volunteer A can act as a
front desk executive and Volunteer B as a guest enquiring
about the availability of rooms.
• After the role play, the remaining members of the group will
give constructive feedback to both the volunteers.
Activity 2
Group-Practice on Constructive Feedback
Material required
Notebook, pen
Procedure
• Form groups with five students in each group.
• Each member in the group should write down three
sentences showing how feedback should NOT be given.
• Then, each group forms a circle. One person in the circle
starts by saying a sentence or feedback. The next person
in the circle tries to make the feedback more constructive.
• Keep repeating until every piece of written feedback has
a constructive alternative.
B. Subjective question
[Figure: Barriers to Communication: Linguistic, Cultural, Physical,
Interpersonal and Organisational]
Linguistic Barriers
The inability to communicate using a language is known
as a language barrier to communication. Language
barriers are the most common communication
barriers, which cause misunderstandings and
Practical Exercise
Activity 1
Role Play on Barriers to Effective Communication.
Material required
Notebook, pen
Procedure
• Form groups with five students in each group.
• Two volunteers from the group should act out a role play of
a salesperson in a shopping mall. For example, Volunteer
A can act as a sales executive, and Volunteer B as a
customer enquiring about a television set. The customer
is from a foreign country.
• Enact the communication barriers or challenges the
customer or salesperson may face while interacting with
each other.
Activity 2
Group practice: Overcoming Barriers
Material required
Notebook, pen
Procedure
• Form groups with five students in each group.
• Each member in a group should write down three ways
to overcome barriers to effective communication. The
group members will then stand in a circle. Each student
should say aloud one point each, till all the ways have
been discussed.
B. Subjective question
Capitalisation
We know that all sentences begin with capital letters.
However, there are certain other points in a sentence
where we should use capital letters. ‘TINS’ is a set of
24 Employability Skills – Class X
Punctuation
A certain set of marks, such as full stop, comma, question
mark, exclamation mark and apostrophe, is used in
communication to separate parts of a sentence for
better clarity of message. Some common punctuation
marks and their rules are shown here in Table 1.8.
Table 1.8: Punctuation Marks
Punctuation name | Sign | Use | Example
Full stop | . | Used at the end of a sentence. Used with short form of long words. | Omar is a professor. His students call him Prof. Omar.
Comma | , | Used to indicate a pause in the sentence. | After getting down from the bus, I walked towards my school.
Comma | , | Used to separate two or more items in a row. | The grocery store had fresh kiwis, strawberries and mangoes.
Question mark | ? | Used at the end of a question. | Where is your book?
Exclamation mark | ! | Used at the end of a word or a sentence to indicate a strong feeling. | What a beautiful dress! Hooray! We won the match.
[Figure: Parts of speech: Adverbs (slowly, quickly, etc.); Pronouns
(he, she, you, I); Adjectives (salty, spicy, etc.); Verbs (walk,
talk, etc.)]
Let us now see how these words are used. Read aloud
the sentence given below.
Hooray! Shyam and his team won the exciting
match yesterday.
We already know that Shyam, team and match are
nouns. ‘Exciting’ is an adjective here because it describes
the noun match, the word ‘won’ is a verb because it
shows an action and the word ‘yesterday’ is an adverb
because it describes when they won the match.
But what about the remaining words in this sentence:
Hooray, the, and? Such supporting words are used to
join the main parts of speech together and also to add
information to the sentences. Let us now look at some
types of these supporting words.
[Figure: Supporting Parts of Speech Types: Articles (a, an, the);
Interjections (wow, oh no, etc.); Conjunctions (and, but, etc.);
Prepositions (in, on, etc.)]
Practical Exercise
The teacher will facilitate these activities by showing you the
e-learning lesson at http://www.psscive.ac.in/stud_text_book.html.
This will include videos and e-content for the above topics
as well as detailed instructions for some activities below.
Initial Thinking Activity
After watching the initial video in the e-learning lesson for
this topic, write down what you think was wrong with
Seema’s letter.
Activity 1
Identifying Parts of Speech
Material required
Notebook, pen
Procedure
• Form groups with five students in each group.
• In the paragraph given below (taken from ‘La Bamba’,
a short story by Gary Soto, p. 115), identify the different
parts of speech and write them down accordingly.
“manuel walked on stage and the song started immediately
glassy-eyed from the shock of being in front of so many
Activity 2
Pair Activity: Sentence Construction
Material required
Notebook, pen
Procedure
• Form pairs of students.
• List the nine parts of speech that you learnt in the lesson.
Select any three of them and create five simple sentences
which use these parts of speech.
• For each part of speech, a volunteer reads out their
sentences. The other students say whether they are correct.
Activity 3
Group Practice: Identify Name, Place, Animal, Thing
Material required
Notepad and pens
Procedure
• Number yourselves from 1 to 5.
• One set of 1–5 is in one group and so on.
• Each member of a group has to say a word that is either a
name, place, animal, thing or feeling; the fifth member has
to perform any kind of action.
• Each group gets 30 seconds to think what they are going
to say and do.
Discussion
The class discussion will highlight different words that are used
to name a person, place, animal, thing, or feeling and their role in
a sentence as parts of speech. The discussion will also highlight
the role of action words as parts of speech.
Nouns: Boy, Ms Sen, Rahim, Children, Cat, Students
Verbs: Swimming, Driving, Writing, Teaching, Eating, Playing
C. Subjective question
Types of Objects
In a sentence, there can be two types of objects — Direct
and Indirect. The objects provided in the above
examples are called direct objects since they are
directly ‘acted on’ by the verb. On the other hand,
an indirect object answers questions, such as ‘to/
for whom’.
For example, in the sentence “She bought a bicycle
for her son”, the verb is ‘bought’.
What did she buy? A bicycle. For whom? For her
son. Here, ‘bicycle’ is the direct object and ‘her son’
is the indirect object. Some sentences only have
direct objects while some have both direct and
indirect objects.
Read aloud the examples given in Table 1.12 and
practice finding the direct and indirect objects.
Table 1.12: Direct and Indirect Objects
Introduction
Self-management, also referred to as ‘self-control,’
is the ability to control one’s emotions, thoughts and
behaviour effectively in different situations. This also
includes motivating oneself, and setting goals. People
with strong self-management skills are better at doing
certain things than others.
Therefore, employers too strongly prefer
people with good self-management skills.
Basics of Self-management
To perform well at work and in life in
general, you must be able to manage
and improve yourself in various skills
including discipline and timeliness,
goal-setting, problem solving, teamwork,
professionalism, etc. Once you develop
your personality and abilities in these
areas, you will be able to succeed in
personal as well as professional life.
Figure 2.1 Self-management
What is Stress?
Stress can be defined as our emotional,
mental, physical and social reaction to any
perceived demands or threats. These demands
or threats are called stressors. Stressors are
the reason for stress.
Figure 2.2 Stress
Stress Management
Stress is a part of everyday life. There are many
instances when stress can be helpful. A fire alarm
is intended to cause stress that alerts you to avoid
danger. The stress created by a deadline to finish a
paper can motivate you to finish the assignment on
time. But when experienced in excess or for a long
period of time, stress has the opposite effect. It can
harm our emotional and physical health, and limit our
ability to function well at home, in school and within
our relationships.
Managing stress is about making a plan to be able to
cope effectively with daily pressures. The ultimate goal
is to strike a balance between life, work, relationships,
relaxation and fun. By doing this, you are able to deal
with daily stress triggers and meet these challenges
head on.
Always keep in mind the ABC of stress management:
A: Adversity or the stressful event
B: Beliefs or the way you respond to the event
C: Consequences or actions and outcomes of the event
Self-management Skills 41
Management Techniques
Here are a few simple stress management techniques.
• Time management: Proper time management
is one of the most effective stress-relieving
techniques.
• Physical exercise and fresh air: A healthy lifestyle
is essential for students. Stress is generally lower
in people who maintain a healthy routine. Doing
yoga, meditation and deep breathing exercises
helps in proper blood circulation and relaxes the
body. Even taking a walk or playing in the park
will help you get a lot of fresh oxygen, which will
help you become more active.
• Healthy diet: Having a healthy diet will also help
you reduce stress. Eating a balanced diet, such
as Dal, Roti, vegetables and fruits will give you
the strength to do your daily work efficiently.
• Positivity: Focussing on negative aspects of life
will add more stress. Instead, learn to look at
the good things and stay positive. For example,
instead of feeling upset over scoring less in a
test, try to maintain a positive attitude and look
at ways to improve the next time.
Emotional Intelligence
Emotional intelligence is the ability to identify and
manage one’s own emotions, as well as the emotions
of others. It is generally said to include at least
three skills:
• Emotional awareness: the ability to identify and
name one’s own emotions.
• Harnessing emotions: the ability to harness
and apply emotions to tasks like thinking and
problem solving.
• Managing emotions: the ability to regulate one’s
own emotions when necessary and help others to
do the same.
Knowing how to manage one’s emotions is critical
for all of us. You can manage stress, keep your brain
Practical Exercise
The teacher will facilitate these activities by showing you the
e-learning module for this lesson via
http://www.psscive.ac.in/Employability_Skills.html. The module will
include videos and e-content for the above topics as well as
detailed instructions for some activities below.
• After watching the video ‘Have you faced this situation?’ in
the e-learning lesson, discuss what you have learnt from
the video. Do you think Priya was worried that she would
not meet her goal? What would you do differently in her
situation?
• After watching the video ‘Managing Stress at Work’ in the
e-learning lesson, discuss the various stress management
techniques that were used in the video by Gaurav to
improve his situation.
Activity 2
Self-reflection
Material required
Pen or pencil
Procedure
• Complete the table below by listing the situation(s) that
can cause stress and what you will do to avoid stress in
such situations.
• Use the stress management techniques shared in the
lesson to complete the exercise.
Stress Causing Situation(s) | Stress Management Techniques
Activity 3
Benefits of taking a holiday
Material required
Pen or pencil
Procedure
• Write an essay to describe the place and your experience
during a holiday trip or summer camp.
• Highlight how the trip helped you de-stress.
Knowing Yourself
Understanding who you are, what you like or dislike, what are your
beliefs, what are your opinions, what is your background, what
you do well and what you do not do well is important because only
then can you actually measure your strengths and weaknesses
(see Figure 2.4).
[Figure 2.4 Knowing Yourself: Who am I? Beliefs, Background,
Opinions, Likes/dislikes, Values]
Strength and Weakness Analysis
Understanding who you are means looking outside your usual
Examples of strengths
• I am good at creative writing.
• I am confident of speaking in front of an audience.
• I play guitar very well.
Examples of weaknesses
• I find it difficult to solve mathematics problems.
• I would like to speak English fluently.
• I do not like to lose in any game or sports.
Finding Weaknesses
• Point out the areas where you struggle and the
things you find difficult to do.
• Look at the feedback others usually give you.
• Be open to feedback and accept your weaknesses
without feeling low about them. Take them as areas
of improvement.
You can find your strengths and weaknesses once
you find answers to the questions given here.
• How am I different from others?
• What do I do better than others?
• What do other people admire in me?
• What makes me stand out?
• Where do I worry and struggle?
• Where, how and why do others perform better
than me?
• What advice for improvement do I often receive
from others?
Activity 1
Pair Activity: Aim in Life
Material required
Pen, notepad or sheets of paper
Procedure
• Form pairs of students.
• Each student will make a list of things that they can do
well based on the given format.
• Share your notes with your partner.
• One volunteer from the pair comes and reads out their
notes in front of the class.
Here is the format for you to fill in
I am
I can (abilities)
I will (plan)
My aim is
Activity 2
Individual Activity: Interests and Abilities Worksheet
Material required
Student textbooks, pen
Procedure
• Each student has to complete the given worksheet,
containing a list of statements and questions.
• Each student has to be real and honest when filling
the worksheet as it is for their own understanding
of themselves.
• If they are not real and honest, they will get incorrect
results about their own interests and abilities.
Worksheet - My Interests and Abilities
I am happiest when
My idea of a perfect day
Types of Motivation
Internal Motivation: LOVE
We do things because they make us happy, healthy
and feel good. For example, when you perform at
your annual day function or learn something
new, such as dancing, singing, etc., you feel good.
Figure 2.8: Qualities of self-motivated people
• Know what they want from life
• Are focussed
• Know what is important
• Are dedicated to fulfill their dreams
Building Self-motivation
There are four steps for building self-motivation, which
are as given below.
1. Find out your strengths: Identify your likes and dislikes.
Understand what makes you happy. For example, I love cooking.
2. Set and focus on your goals: Define the goals you want to
achieve and focus all your energy to achieve your goal. For
example, I want to be a chef.
3. Develop a plan to achieve your goals: Plan and set timelines
to achieve your goals. Plan a list of activities that you will do to
achieve each goal. For example, after schooling, you may be
required to appear for a competitive examination to join Hotel
Management Institute.
4. Stay loyal to your goals: Work towards achieving your goal,
even when you are facing a difficult time. For example, even
though I did not clear the Hotel Management entrance exam, I will
find out other ways to become a chef.
Activity 1
Staying Motivated (Group Discussion)
Material required
Pen, notepad or sheets of paper, chart paper
Procedure
• Form groups of three.
• Choose any one of the following situations and write down
the steps you would take to motivate yourself.
• Your teacher gives you feedback on the essay you had
written. There are a lot of negative remarks. What will
you do to motivate yourself to improve the essay?
• Your father has given you the responsibility of
arranging for a birthday party for your little sister who
is turning 3 years old. You do not want to do this task.
How will you motivate yourself to do the work?
Activity 2
Self Reflection
Material required
Pen or pencil
Procedure
• Make a list of reasons that stop you from being motivated.
• Write down ways by which you will motivate yourself to
overcome them.
Reasons for not being motivated | Ways to overcome
For example: People make fun of the way I speak English. | For example: I will learn to speak English correctly by attending classes after school.
Practical Exercise
The teacher will facilitate these activities by showing you the
e-learning module for this lesson via
http://www.psscive.ac.in/Employability_Skills.html. The module will
include videos and e-content for the above topics as well as
detailed instructions for some activities given ahead.
• After watching the initial video ‘Introduction’ in the
e-learning lesson, discuss in the class: Why did Amit
feel he was not prepared for the future?
• After watching the video ‘Setting SMART Goals’ in the
e-learning lesson, discuss what you have learnt from
the video.
Activity 2
Long-term Goals and Short-term Goals (Peer Feedback)
Material required
Pen, notepad or sheets of paper
Procedure
• Form groups of four. Work individually in your group and
complete the table below. Once completed, share with
your group and seek feedback on your goals. Share your
feedback when other members of the group are presenting
their goals.
Short-term Goals (What are your goals in the next 6 months to 2 years?) | Long-term Goals (What are your goals in the next 5 years?)
1. | 1.
2. | 2.
3. | 3.
Practical Exercise
The teacher will facilitate these activities by showing you the
e-learning module for this lesson via
http://www.psscive.ac.in/Employability_Skills.html. The module will
include videos and e-content for the above topics as well as
detailed instructions for some activities given below.
After watching the video ‘Time Management’ in the e-learning
lesson, discuss: What have you learnt from the video? Which
steps of time management were followed in the video?
Activity 2
Managing your time to reach school on time
Material required
Pen
Procedure
• List out the to-do plan with timing to make sure you reach
school on time.
To-do List
1.
2.
3.
4.
Starting a Computer
What is the first thing you do after you wake up in
the morning? What if your father tells you to do your
homework immediately? Can you do it? Normally you
would do some daily activities and get ready before you
start working. Similarly, when a computer is switched
on, it performs some basic processes/functions before
it is ready to take instructions from the user.
To start a computer, press the Power button on the
CPU. This will start the operating system and display
the Ubuntu desktop as shown in Figure 3.4 or the main
screen on the monitor.
Figure 3.4: Power Button
Function Keys
Keys labeled from F1 to F12 are function keys. You
use them to perform specific functions. Their functions
differ from program to program. The function of the F1
key in most programs is to get help on that program.
Some keyboards may have fewer function keys.
(a) Control keys: Keys, such as Control (CTRL),
SHIFT, SPACEBAR, ALT, CAPS LOCK and TAB,
are special control keys that perform special
functions depending on when and where they
are used.
(b) Enter key: The label on this key can be either
ENTER or RETURN, depending on the brand of
computer that you are using. You use the ENTER
or the RETURN key to move the cursor to the
beginning of a new line. In some programs, it is
used to send commands and to confirm a task on
a computer.
Double-click
Double-clicking means to quickly click the
left mouse button twice. When we double-click
on a file, it will open the file.
Figure 3.11 Drag and Drop
Practical Exercise
The teacher will facilitate these activities by showing you the
e-learning lesson at http://www.psscive.ac.in/stud_text_book.html
-> Using a Computer. This will include videos and e-content
for the above topics as well as detailed instructions for some
activities below.
Initial Thinking Activity
After watching the initial video, write what you think happens
when you start a computer and enter data using a keyboard
and mouse.
Activity 1
Group Demo on Use of Computer
Material required
Pen, notebook, computer
Procedure
• Form groups depending on the number of computers
available. One student starts the computer and logs in.
• Another student identifies the keys on the keyboard.
A third student then performs all the functions of the
mouse such as hover, click, double-click, etc.
• Discuss and note differences between hardware and
software and also how they work together to perform a
task on the computer.
Activity 2
Group Practice: Using the Keyboard
Material required
Computer
Procedure
• Form groups depending on the number of computers
available.
• Open a text editor in Ubuntu by typing ‘editor’
in the search bar and then selecting the Text Editor.
You can also open Notepad in Windows by typing Notepad
on the Windows Search bar and then selecting Notepad
from the search result.
Figure 3.12 Typing
• One student positions his or her hands on the keyboard as
shown in Figure 3.12 and types the following paragraph
in the text editor.
“People use computers at work, at school and at home every day.
In factories computers are used to control the manufacturing
process and in offices to make documents, such as reports. We
also use computers for sending e-mails and playing games.”
Now, another student in the group will check the paragraph
and correct the grammar and spelling mistakes.
B. Subjective questions
1. What is the function of the ENTER key?
2. How will you prevent others from using your computer?
Figure 3.17: Choose the File Option
Figure 3.18: File Explorer
Figure 3.19: Right-click on Desktop and click New Folder
Figure 3.20: Type Demo as the name of the new folder
Activity 1
Creating a folder
Material required
Pen, notebook, computer
Procedure
• Form groups depending on the number of computers
available. Each member of the group creates a new folder.
Others can watch and give feedback on what was done
correctly and what can be improved.
• Open a text editor in Ubuntu or Notepad in Windows.
• Create two folders, Demo1 and Test1.
• Now delete the folder Test1.
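The same steps of creating two folders and deleting one can also be done with a short program instead of the mouse. The sketch below is only an illustration and is not part of the activity: it assumes Python is installed, the function name folder_exercise is made up for this example, and it works inside a temporary directory so that no real folders on your computer are changed.

```python
from pathlib import Path
import tempfile


def folder_exercise(base):
    """Create Demo1 and Test1 inside base, then delete Test1."""
    base = Path(base)
    demo = base / "Demo1"
    test = base / "Test1"
    demo.mkdir(exist_ok=True)   # create the folder Demo1
    test.mkdir(exist_ok=True)   # create the folder Test1
    test.rmdir()                # delete the (empty) folder Test1
    # Return the names of the folders that remain
    return sorted(p.name for p in base.iterdir() if p.is_dir())


# Work in a temporary directory so that no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    remaining = folder_exercise(tmp)
    print(remaining)
```

Only Demo1 should remain in the list after Test1 has been deleted.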
Each group can study the following shortcut commands together.
CTRL+z: undo
CTRL+y: redo
CTRL+a: select all
CTRL+x: cut
CTRL+c: copy
CTRL+v: paste
CTRL+p: print
CTRL+s: save
B. Subjective questions
1. How is a computer file system similar to our physical file
system in a school?
2. What are the steps you will perform to save a text file in
Ubuntu?
Practical Exercise
Activity 1
Making a Chart
Material required
Pen, notebook, chart paper, pictures.
Procedure
• Form groups and make a chart to list down all the
ways in which a device can be damaged and how it can
be prevented.
• Make sure all students in the group get a chance
to participate.
B. Subjective questions
Threats to Computer
Threats are the ways in which personal
information can be leaked from a computer
without our knowledge.
(a) Theft: Theft means stealing of
information or hardware. These may be
of three types:
• Physical: Where a person may
steal your desktop computer or
laptop.
• Identity: Where a hacker steals
your personal information and
assumes your identity. Using this
false identity, the hacker can gain
access to your account information
or perform illegal activity.
• Software Piracy: This is stealing
of software and includes using
or distributing unlicensed and
unauthorised copies of a computer
program or software.
Figure 3.31: Physical stealing
Figure 3.32: Online stealing
(b) Virus: Viruses are computer programs that can
damage the data and software programs or steal
the information stored on a computer. Major
types of viruses are Worms and Trojan Horse.
• Worms: These are viruses that replicate
themselves and spread to all files once they
Figure 3.33: Worm virus
Practical Exercise
The teacher will facilitate these activities by showing you the
e-learning lesson at http://www.psscive.ac.in/stud_text_book.html.
This will include videos and e-content for the above topics
as well as detailed instructions for some activities below.
Initial Thinking Activity
After watching the initial video, write down the types of risk
to the data available in different places, for example, in a school,
hospital, bank, etc.
Activity 1
Group Chart Making
Material required
Pen, notebook, computer, Chart paper, colours
Procedure
• Form groups depending on the number of students
available. Make a chart to show all the different threats
faced by a computer and how you can protect a computer
from such threats.
B. Subjective questions
Introduction
Entrepreneurship is being talked about a lot in the
world today, and especially in India. Entrepreneurship
is the type of self-employment where one is running a
business to satisfy the needs of people and looking for
ways to make the business better to make profits. This
unit focusses on encouraging students to learn about
entrepreneurship and its functions from the world
around them.
Entrepreneurs are all around us. We would have
spoken to a lot of them through the course of this
module. We also learnt that successful entrepreneurs
have the following qualities.
• They are confident. They believe in themselves
and their abilities.
• They keep trying new ideas in their business.
• They are patient.
• They are creative and think differently about
business ideas.
• They take responsibility for their actions.
• They take decisions after thinking about them.
• They work hard.
• They do not give up when they face a difficulty.
Practical Exercise
Activity 1
Entrepreneurs I know: Individual Practice
Procedure
• In this activity, we will think of the entrepreneurs we know.
Instructions
1. Think of 4 entrepreneurs whom you know or have seen.
2. Draw circles and in each circle write the name of that
entrepreneur, what business they run, and one thing
that you really like about their business.
3. After writing, share the details of the entrepreneurs with
your class.
Help Society
Entrepreneurs have a positive relationship with society.
They make profits through activities that benefit
society. Some entrepreneurs work towards saving the
environment, some give money to build schools and
hospitals. This way, the people and area around them
become better.
These are the roles that entrepreneurs play in a society.
How do you think entrepreneurs affect the society they
live in? Let’s read.
Create Jobs
With the growth of a business, entrepreneurs look for
more people to help them. They buy more material, and
from more people. They also hire more people to work for
them. In this way, more people have jobs.
Sharing of Wealth
Wealth means having enough money to live a comfortable
life. As entrepreneurs grow their business, the people
Entrepreneurial Skills 87
Practical Exercise
Activity 2
Field Work: Let’s be an Entrepreneur
Procedure
• In this activity, students will find problems on their school
campus that can be turned into business opportunities.
Instructions
1. Form groups of 3 each.
2. Take 30 minutes to go around your school.
3. Note down 2–3 problems you see on your school campus.
4. Write down some business ideas to solve these problems
in the table given below.
5. Also think about how your business ideas will help the
school. One example has been written for you.
Problem | Business ideas | How will this help the school?
For example, plastic cola bottles from the canteen are harming the environment. | 1. Make plant pots out of bottles and sell them to students and parents. | 1. The school will look green and beautiful. The air will be fresh.
 | 2. Sell cola in glass bottles. | 2. The canteen owner will spend less money on buying glass bottles because they can be used again.
Qualities of an Entrepreneur
Quality is a way in which a person acts or behaves.
Some examples of qualities in people are hardworking,
nice, rude, etc. Read the comic strips in Figures 4.2 and
4.3 and learn about the qualities of an entrepreneur.
“You must believe in yourself. You should be CONFIDENT and take business decisions.”
“Oh! That idea did not work. But, it is okay. I keep TRYING NEW IDEAS.”
“Running my business is difficult. But, I am PATIENT because I know success will come soon.”
“I had a CREATIVE and different solution to the problem. That is why I am successful!”
Figure 4.2
Figure 4.3
Practical Exercise
Activity 1
My Entrepreneurial Qualities: Self-assessment
Procedure
• In this activity, the students will rate themselves on the
entrepreneurial qualities mentioned below.
Mark Y (yes) or N (no) against each statement.
• I believe in myself and what I can do. I am confident. (Y / N)
• I keep trying new ideas. (Y / N)
• Problems take time to get solved. I am patient about solving them. (Y / N)
• I think of different ways to solve a problem. I am creative. (Y / N)
• I think before I make a decision. (Y / N)
• I take responsibility for my actions and mistakes. (Y / N)
• I do not give up when I face a problem. (Y / N)
• I work hard on every task. (Y / N)
Figure A
Activity 2
Let us Solve a Problem!
Procedure
• In this activity, students will try and solve a problem in
their vicinity.
Instructions
1. Similar to Activity 1.2, select a problem in the area
near your home. This could be a problem that really
bothers you.
2. Make a 5-step plan for how you will solve the problem.
3. Implement step 1 of your solution!
4. After that, try implementing all the steps. Try your
solution for a week.
5. At the end of the week, rate yourself again on the
entrepreneurial qualities you rated yourselves on in
Activity 2.2.
Questions for Discussion
Are your ratings on your entrepreneurial qualities before doing
the activity and after doing the activity different?
What qualities did you see yourself apply in the activity? You
would have applied some or all of these entrepreneurial qualities
while implementing your solution. You did not implement a
business solution, but you exercised these qualities anyway.
These are ideal qualities that an entrepreneur has.
However, any individual who is trying to solve a problem can
be entrepreneurial. If employees of a factory or company work
hard to try new ideas to make their company’s products better
or find creative ways to get work done, they are also showing the
qualities of an entrepreneur. They are also being entrepreneurial.
If your mother or father works in a company, ask them if they
show these qualities.
Functions of an Entrepreneur
If you were to become an entrepreneur, you now know how you
would think and act. But, what would you actually be doing in
your business? What work will you do every day? Let’s find out.
Figure a
Figure b
C. Subjective question
Misconception 1
Practical Exercise
Activity 1
Identifying Everyday Heroes
Procedure
• In this activity, the teacher will make chits about different
professions and the students will act them out. There will
be a discussion after that. The professions are
1. a vegetable seller not using plastic bags
2. a businesswoman running a delivery system
3. a chai wala selling fruit flavoured tea
4. a gold seller selling gold teeth
Instructions
1. There will be professions of different people written on
each chit of paper. The student reads the profession and
acts it out for the class.
2. Identify what each person is doing differently in
their business.
Questions for Discussion
1. Are all these people entrepreneurs? Why or why not?
2. Being a vegetable seller, selling chai or selling gold —
How many of these are new business ideas? How many
of these are common business ideas?
Practical Exercise
Activity 2
Talking to Entrepreneurs: Interview
Procedure
• In this activity, students speak to entrepreneurs and learn
about the money needed to start a business and how
to raise money needed for the business. Students should
find out how the entrepreneur raised the money for
their business.
Instructions
1. Identify three different types of successful entrepreneurs
in your area.
2. Ask them how much money they started their
businesses with.
3. Ask what the sources of this money were.
4. Ask how they raised the money.
5. Caution — not everyone likes talking about money.
Please ask your questions with respect. If someone does
not want to answer, let it be!
6. Fill ‘Table a’ after the conversation.
Table a
Entrepreneur Name | E.g. Kashish
Type of business | Lightbulb shop
Capital | ₹ 50,000 – ₹ 1,00,000
Misconception 3
Practical Exercise
Activity 3
Make and sell
Procedure
• In this activity, students make an item in class and step
out to sell it to someone.
Instructions
1. Form groups of 5 people each. The group should have a
mix of boys and girls.
2. Look into your bags and desks and find any three items.
Put them on your desk.
3. With the materials you’ve collected, make an object.
Take 15 minutes to do this.
4. Now, take 30 minutes to sell it for money, to someone
in school.
Questions for discussion
Were you able to do it? What do you now think — can you be
an entrepreneur?
Story | Misconception
Ramu owns a large clothes shop. Shamu has a small store selling handmade sarees. Shamu does not call himself an entrepreneur. | (a) Every business idea needs to be unique or special.
Anna has a great idea for a website. She has ₹ 5,000. She is waiting for ₹ 20,000 more, so that she can start it. | (b) Entrepreneurs are born, not made.
Session 4: Entrepreneurship as a
Career Option
So far, we have discussed the effect of entrepreneurship
on society, the qualities and functions of an
entrepreneur, and misconceptions we might have about
entrepreneurship.
In this section, we shall think about entrepreneurship
as a life choice.
Practical Exercise
Activity 1
Talking about entrepreneurship as a life option
Procedure
• In this activity, you will compare entrepreneurship and
wage employment.
Instructions
1. Get into pairs.
2. Imagine five years in the future — one person in the
pair is wage employed and the other person is an
entrepreneur. Discuss how your lives are similar and
different from each other.
3. Have a debate with your class and your teacher.
Activity 2
Presenting about the Power of Entrepreneurship
Procedure
• In this activity, students shall prepare and present why
they think entrepreneurship is a good life option for a
person and for society.
Instructions
1. Get into groups of 5 each.
2. Imagine you believe that people should become
entrepreneurs. You are speaking at your school
assembly. You have to talk to the audience about the
power of entrepreneurship. Prepare a presentation for
the same.
3. You can present in any way: talk, draw, act, sing,
or dance.
4. You have 15 minutes to prepare. You will have 5 minutes
to present.
Things to remember
1. An entrepreneur does a lot of work in his or her
business. One has to learn and practise these actions
before trying them out in a business. This can be
done by either learning them in school and college
or practising them while working for someone.
2. If you believe in your idea, start your business.
3. Being an entrepreneur can be risky. But if you do
not try, you will not know!
Introduction
The environment around us affects all aspects of our
life; and all our day-to-day activities also affect the
environment. Those who live in cities get their food
supply from surrounding villages, which in turn are
dependent on forests, grasslands, rivers and seashores for
resources, such as water, fuel wood, fodder, etc. We use
natural resources for food. Everything around us forms
our environment and our lives depend on the natural
world around us.
Over the years, with economic development, there
has been an increase in environmental pollution.
For example, with the introduction of high input
agriculture, we can grow more food by using fertilisers,
pesticides and hybrid crops. But it has led to soil and
environmental degradation. We need to plan the use of
resources in a sustainable manner so that we and our
future generations can enjoy a good environment.
Sustainable Processes
Some practices, such as organic farming, vermi-composting
and rainwater harvesting are being used to help preserve
the environment.
Organic farming is where farmers do not use chemical
pesticides and fertilisers to increase their production.
They use organic and natural fertilisers, such as cow
dung to help in growing crops. This helps in producing better
quality, chemical-free crops while at the same time
maintaining the soil quality for future use. This is a true
example of sustainable development, where we are not
only using the earth’s resources but are also preserving
them for our future generations.
Practical Exercise
Activity 1
Create a Garden in School or Plant Trees
Material required
Seeds, garden waste, sprinkler, gardening tools
Activity 2
Discussion on How to Prevent Wastage.
Procedure
• Form groups depending on the number of students
available.
• Every student in the group will name a way in which wastage
of water and food can be stopped or prevented.
• Make a list and share it with the rest of the class.
B. Subjective questions
Quality Education
Education is one of the most important factors for sustainable
development. Children who have gone to school will be
able to do jobs so that they can take care of themselves
and their families. Education helps us become aware of
our role as a responsible citizen. We should
1. use the facilities present in our areas.
2. take our friends to school.
3. help friends study.
4. stop friends from dropping out of school.
Reduced Inequalities
To reduce inequalities we can
1. be helpful to one another.
2. be friendly with everyone.
3. include everyone while working or playing.
4. help others by including everyone, whether they
are small or big, girl or boy, and whatever class
or caste they belong to.
Practical Exercise
Activity 1
Group Discussion
Procedure
• Form groups depending on the number of students
available.
• Every student will describe one way in which they can
work to conserve and protect the environment.
• Make a list and share it with the rest of the class.
Activity 2
Make an Art Project Using Waste
Material required
Plastic bags, used bottles, paper cups, paper, wire, etc.
B. Subjective questions
Life without machines today is unimaginable, and because of this, humans have been putting efforts
into making them even more sophisticated and smart. As a result, we are surrounded by smart devices
and gadgets like smartphones, smartwatches, smart TV, etc. But what makes them smart?
For example, how is a smartphone today different from the telephones we had in the last century?
* Images shown here are the property of individual organisations and are used here for reference purpose only.
Today’s phones can do much more than just call
people. They can help us in navigating, recommend
which songs we should listen to or which movies we
should watch according to our own likes and
dislikes. Our phones can help us connect with like-
minded people, make our selfies fun with face
filters, help us maintain a record of our health and
fitness and a lot more. These drastic technological
advancements lead us to recognize one key
concept: the concept of Artificial Intelligence.
What is Artificial Intelligence anyway? Well, the answer lies in the term itself. If we break this term
up, we get the words “Artificial” and “Intelligence”. Artificial is something which is man-made, which
does not occur naturally. But what about Intelligence, how do we define that?
According to researchers, intelligence is the ‘ability to perceive or infer information, and to retain it as
knowledge to be applied towards adaptive behaviours within an environment or context.’
If we try to define intelligence with the help of its traits, these are the abilities that are involved in
intelligence:
Let us define each term mentioned above to get a proper understanding:
• Spatial Visual Intelligence: The ability to perceive the visual world and the relationship of one object to another.
• Kinaesthetic Intelligence: The ability to use one's limbs in a skilled manner.
• Musical Intelligence: As the name suggests, this intelligence is about a person's ability to recognize and create sounds, rhythms, and sound patterns.
• Intrapersonal Intelligence: Describes a person's level of self-awareness, from recognising their own weaknesses and strengths to understanding their own feelings.
Even though a person may be more skilled in one intelligence than in another, it should be noted that all humans have all 9 of
these intelligences, only at different levels. One might be an expert at painting, while another might be an expert in
mathematical calculations. One is a musician, another is an expert dancer.
In other words, we may define intelligence as:
For example, if someone starts talking to us, we know how to keep the conversation going. We can
understand what people mean and can reply in the same way. When we are hungry, we can come up
with various options on what to eat depending upon the food we have at our homes. When we read
something, we are able to understand its meaning and answer anything regarding it.
While understanding the term intelligence, it must be noticed that decision making forms a
crucial part of intelligence. Let us delve deeper into it.
Decision Making
You’re trapped. All the doors seem to have started shrinking and only one of them leads you out.
Which door would you pick?
We can’t make “good” decisions without information because then we have to deal with unknown
factors and face uncertainty, which leads us to make wild guesses, flip coins, or roll a dice.
Having knowledge, experience, or insights about a given situation helps us visualize what the
outcomes could be and how we can achieve or avoid those outcomes.
Scenario 1
You are locked inside a room with 3 doors, and you need to find a safe
door to get out. Behind the 1st door is a lake with a deadly shark. Behind the 2nd door is a mad
psychopath ready to kill with a weapon, and behind the 3rd door is a lion that has not eaten for the last 2
months.
Which door would you choose, and why?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
The answer is door number 3. Since the lion has not eaten for 2 months, it
would not have survived and would already be dead. This makes going out through door 3 the
correct option.
Scenario 2
Aarti invited four of her friends to her house. They hadn't seen each other in a long time, so they
chatted all night long and had a good time. In the morning, two of the friends Aarti had invited died.
The police arrived at the house and found that both friends had been poisoned and that the poison
was in the strawberry pie. The three survivors told the police that they hadn't eaten the pie.
The police asked, "Why didn't you eat the pie?" Shiv said, "I am allergic to strawberries." Seema
said, "I am on a diet." And Aarti said, "I ate too many strawberries while cooking the pie, I just didn't
want any more."
The policemen looked at the pictures of the party and immediately identified the murderer.
Look at the picture and identify the murderer. Also state why you think this person is the murderer.
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
The answer is Seema. Can you guess how the police could tell? It is because she said she was on a diet,
but in the picture she is eating a burger and fries, which means she lied.
The above scenarios show that it’s the information which helps humans take good decisions.
For example, in elementary school, we learn about alphabets and eventually we move ahead to
making words with them. As we grow, we become more and more fluent in the language as we keep
learning new words and use them in our conversations.
Every now and then, we surf the internet for things on Google
without realizing how efficiently Google always responds to us
with accurate answers. Not only does it come up with results
to our search in a matter of seconds, it also suggests and auto-
corrects our typed sentences.
To help us navigate to places, apps like Uber and Google Maps come in handy.
Thus, one no longer needs to stop repeatedly to ask for directions.
AI has not only made our lives easier but has also been
taking care of our habits, likes, and dislikes. This is why
platforms like Netflix, Amazon, Spotify, YouTube etc.
show us recommendations on the basis of what we
like.
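The idea of recommending things on the basis of what we like can be sketched in a few lines of Python. This is only a toy illustration: the catalogue, the titles and the function name recommend are made up for this example, and real platforms use far more data and far more advanced methods.

```python
# Hypothetical catalogue: each made-up title is tagged with one genre.
CATALOGUE = {
    "Space Quest": "sci-fi",
    "Robot Dreams": "sci-fi",
    "Star Lanes": "sci-fi",
    "Laugh Out Loud": "comedy",
    "Funny Bones": "comedy",
    "Haunted Hill": "horror",
}


def recommend(watched, catalogue=CATALOGUE):
    """Suggest unwatched titles from the genre the user has watched most."""
    genre_counts = {}
    for title in watched:
        genre = catalogue[title]
        genre_counts[genre] = genre_counts.get(genre, 0) + 1
    if not genre_counts:
        return []  # nothing watched yet, so nothing to base a suggestion on
    favourite = max(genre_counts, key=genre_counts.get)
    return [t for t, g in catalogue.items() if g == favourite and t not in watched]


# Two sci-fi titles and one comedy watched, so sci-fi is the favourite genre.
print(recommend(["Space Quest", "Robot Dreams", "Laugh Out Loud"]))
```

The more titles a person watches, the more information the program has about their likes, and the better its suggestions can become.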
A fully automatic washing machine can work on its own, but it requires human
intervention to select the parameters of washing and to do the necessary preparation for
it to function correctly before each wash, which makes it an example of automation, not
AI.
An air conditioner can be turned on and off remotely with the help of the internet but still
needs a human touch. This is an example of Internet of Things (IoT). Also, every now and
then we get to know about robots which might follow a path or maybe can avoid
obstacles but need to be primed accordingly each time.
We also get to see a lot of projects which can automate our surroundings with the
help of sensors. Here too, since the bot or the automation machine is not trained with
any data, it does not count as AI.
Also, it would be valid to say that not all devices termed "smart" are AI-enabled. For
example, a TV does not become AI-enabled just because it is a smart one; it gets the power of AI only when it is able
to think and process on its own.
Just as humans learn how to walk and then improve this skill with the help of their experiences, an AI
machine too gets trained first on the training data and then optimises itself according to its own
experiences which makes AI different from any other technological device/machine.
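The difference between a fixed rule and a rule learnt from data can be sketched as follows. This is a simplified illustration under stated assumptions, not how real AI systems are built: the fan example, the temperature readings and the function names are all hypothetical, and the "training" here is just finding a midpoint.

```python
def automated_fan(temperature):
    """Automation: a person fixed this rule in advance and it never changes."""
    return temperature >= 30


def train_threshold(examples):
    """Learn a switching point from (temperature, fan_was_on) examples.

    Uses the midpoint between the warmest 'off' reading and the coolest
    'on' reading. Assumes the examples contain both kinds of reading and
    that every 'off' reading is cooler than every 'on' reading.
    """
    warmest_off = max(t for t, on in examples if not on)
    coolest_on = min(t for t, on in examples if on)
    return (warmest_off + coolest_on) / 2


# Made-up training data: when this user actually switched the fan on.
training_data = [(22, False), (25, False), (27, True), (31, True)]
threshold = train_threshold(training_data)  # (25 + 27) / 2 = 26.0


def learned_fan(temperature):
    """Trained behaviour: the rule comes from the data, not from the programmer."""
    return temperature >= threshold


print(automated_fan(28), learned_fan(28))
```

At 28 degrees the fixed rule keeps the fan off, while the learnt rule turns it on because this user's past behaviour suggests they want it on. Given different training data, the learnt rule would change; the fixed rule would not.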
Of course, these other technologies can also be integrated with AI to provide users with a
much better and more immersive experience!
Robotics and AI can open the doors to humanoids and self-driving cars; AI merged
with the Internet of Things can give rise to cloud computing of data and remote access to AI tools;
and automation along with AI can help in achieving voice-automated homes, and so on. Such integrations
can help us get the best of both worlds!
Introduction to AI: Basics of AI
As discussed in the last chapter, Artificial Intelligence has always been a term which intrigues people
all over the world. Various organisations have coined their own versions of defining Artificial
Intelligence. Some of them are mentioned below:
European Artificial Intelligence (AI) leadership, the path for an integrated vision
AI is not a well-defined technology and no universally agreed definition exists. It is rather a cover term
for techniques associated with data analysis and pattern recognition. AI is not a new technology,
having existed since the 1950s. While some markets, sectors and individual businesses are more
advanced than others, AI is still at a relatively early stage of development, so that the range of
potential applications, and the quality of most existing applications, have ample margins left for
further development and improvement.
Encyclopaedia Britannica
Artificial intelligence (AI) is the ability of a digital computer or computer-controlled robot to
perform tasks commonly associated with intelligent beings. The term is frequently applied to the
project of developing systems endowed with the intellectual processes characteristic of humans, such
as the ability to reason, discover meaning, generalize, or learn from past experience.
As you can see, Artificial Intelligence is a vast domain. Everyone looks at AI in a different way according
to their mindset. Now, according to your knowledge of AI, start filling the KWLH chart:
K • What I Know?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
In other words, AI can be defined as:
The theory and development of computer systems (both machines and software) that enable machines to
perform tasks that normally require human intelligence.
Artificial Intelligence covers a broad range of domains and applications and is expected to impact every
field in the future. Overall, its core idea is building machines and algorithms which are capable of
performing computational tasks that would otherwise require human-like brain functions.
AI, ML & DL
As you have been progressing towards building AI readiness, you must have come across a very
common confusion between Artificial Intelligence (AI) and Machine Learning (ML). Many times, these
terms are used interchangeably, but are they the same? Is there no difference between Machine Learning
and Artificial Intelligence? Is Deep Learning (DL) also Artificial Intelligence? What exactly is Deep
Learning? Let us see.
Deep Learning (DL)
It enables software to train itself to perform tasks using vast amounts of data. In Deep Learning, the
machine is trained with huge amounts of data, which helps it to train itself around that data. Such
machines are intelligent enough to develop algorithms for themselves. Deep Learning is the most
advanced form of Artificial Intelligence out of these three. Next comes Machine Learning, which is
intermediately intelligent. Artificial Intelligence covers all the concepts and algorithms which, in
some way or the other, mimic human intelligence.
Of the many applications of AI, some come under ML, and of those only a few can be labelled as DL.
Therefore, Machine Learning (ML) and Deep Learning (DL) are both part of
Artificial Intelligence (AI), but not everything that is Machine Learning is Deep Learning.
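The relationship between the three terms can be pictured as sets sitting one inside the other. The short Python sketch below shows this with sets; the example applications and the group each one is placed in are assumptions made purely for illustration, since a real product may be built in more than one way.

```python
# Hypothetical labels, chosen only to illustrate the nesting of the three terms.
ai = {
    "rule-based chess program",
    "spam filter",
    "movie recommender",
    "image recogniser",
    "speech recogniser",
}
ml = {"spam filter", "movie recommender", "image recogniser", "speech recogniser"}
dl = {"image recogniser", "speech recogniser"}

# Every DL application is also ML, and every ML application is also AI.
print(dl <= ml <= ai)

# AI that is not ML, and ML that is not DL.
print(sorted(ai - ml))
print(sorted(ml - dl))
```

The first line printed is True, which confirms the nesting. The other two lines list what is left over at each level: one example that is AI but not ML, and two that are ML but not DL.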
Introduction to AI Domains
Artificial Intelligence becomes intelligent according to the training which it gets. For training, the
machine is fed with datasets. According to the applications for which the AI algorithm is being
developed, the data which is fed into it changes. With respect to the type of data fed in the AI
model, AI models can be broadly categorised into three domains:
Data Sciences
Data sciences is a domain of AI related to data systems and processes, in which the system collects
large amounts of data, maintains data sets and derives meaning from them.
The information extracted through data science can then be used to make decisions.
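As a small illustration of collecting data, deriving meaning from it and then making a decision, consider the Python sketch below. The canteen, the sales figures and the safety margin of 15 plates are all made up for this example.

```python
from statistics import mean

# Made-up data: plates of lunch sold in a school canteen on five days.
plates_sold = {"Mon": 120, "Tue": 135, "Wed": 128, "Thu": 140, "Fri": 152}

# Derive meaning from the data.
average = mean(plates_sold.values())
busiest_day = max(plates_sold, key=plates_sold.get)

# Make a simple decision from that information:
# prepare the average number of plates plus a margin of 15 (an assumed figure).
plates_to_prepare = average + 15

print(average, busiest_day, plates_to_prepare)
```

Here the raw numbers are the data, the average and the busiest day are the information extracted from it, and the number of plates to prepare is the decision.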
Computer Vision
Computer Vision, abbreviated as CV, is a domain of AI that describes the capability of a machine to acquire
and analyse visual information and then make predictions or decisions about it. The entire process
involves acquiring, screening, analysing and identifying images and extracting information from them. This extensive
processing helps computers to understand visual content and act on it accordingly. In computer
vision, the input to machines can be photographs, videos and pictures from thermal or infrared sensors,
indicators and other sources.
Computer vision related projects translate digital visual data into descriptions. This data is then turned
into computer-readable language to aid the decision-making process. The main objective of this
domain of AI is to teach machines to collect information from pixels.
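To a computer, an image is only a grid of numbers called pixels. The sketch below shows, in a very simplified way, how pixel values can be turned into a description. The 4 x 4 image and its values are made up, and the function describe is a hypothetical helper; real computer vision systems work on far larger images and use trained models.

```python
# A tiny 4x4 greyscale "image": 0 is black, 255 is white (made-up values).
image = [
    [10, 20, 200, 220],
    [15, 25, 210, 230],
    [12, 18, 205, 225],
    [11, 22, 215, 235],
]


def describe(image):
    """Turn raw pixel numbers into a short text description."""
    half = len(image[0]) // 2
    left = [p for row in image for p in row[:half]]
    right = [p for row in image for p in row[half:]]
    left_mean = sum(left) / len(left)
    right_mean = sum(right) / len(right)
    if right_mean > left_mean:
        return "brighter on the right"
    if left_mean > right_mean:
        return "brighter on the left"
    return "evenly lit"


print(describe(image))
```

The program never "sees" a picture. It only compares numbers, yet the result is a description that a person can understand.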
The ultimate objective of NLP is to read, decipher, understand, and make sense of the human languages
in a manner that is valuable.
Email filters
Email filters are one of the most basic and
initial applications of NLP online. It started
out with spam filters, uncovering certain
words or phrases that signal a spam
message.
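A very basic spam filter of the kind described above can be sketched in Python. The list of phrases and the function name is_spam are assumptions made for this example; modern email filters learn from data instead of relying on a fixed list.

```python
# Hypothetical phrases that often signal a spam message.
SPAM_PHRASES = ["win a prize", "free money", "click here now"]


def is_spam(message, phrases=SPAM_PHRASES):
    """Return True if the message contains any of the listed phrases."""
    text = message.lower()  # ignore the difference between capital and small letters
    return any(phrase in text for phrase in phrases)


print(is_spam("Click HERE now to WIN a prize!"))
print(is_spam("Homework is due on Monday"))
```

The first message is flagged as spam and the second is not. A filter like this is easy to fool by changing the wording slightly, which is one reason why later filters moved towards learning patterns from many example messages.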
Smart assistants
Smart assistants like Apple’s Siri and Amazon’s Alexa recognize
patterns in speech, then infer meaning and provide a useful
response.
AI Ethics
Nowadays, we are moving from the Information era to Artificial Intelligence era. Now we do not use
data or information, but the intelligence collected from the data to build solutions. These solutions
can even recommend the next TV show or movies you should watch on Netflix.
We can proudly say that India is leading in the AI usage trends, so we need to keep aspects relating to
ethical practices in mind while developing solutions using AI. Let us understand some of the ethical
concerns in detail.
Scenario 1:
Let us imagine that we are in the year 2030. Self-driving cars, which are just a concept today, are
now on the roads. People like us are buying them for ease and using them for daily travel. Of course,
because of all the features it has, such a car is expensive. Now, let us assume that one day your father
is going to the office in his self-driving car. He is sitting in the back seat as the car drives itself. Suddenly,
a small boy comes in front of the car. The incident is so sudden that the car can make only
one of two choices:
1. Go straight and hit the boy who has come in front of the car and injure him severely.
2. Take a sharp right turn to save the boy and smash the car into a metal pole thus damaging the car
as well as injuring the person sitting in it.
With the help of this scenario, we need to understand that the developer of the car goes through all
such dilemmas while developing the car’s algorithm. Thus, the morality of the developer gets
transferred into the machine: whatever the developer considers right is given a higher priority and
hence becomes the choice made by the machine.
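How a developer's priorities end up inside a machine can be sketched in code. The sketch below is a deliberately oversimplified illustration, not how a real self-driving car is programmed, and it does not say which choice is right: the action names and the harm scores are hypothetical numbers that a developer might pick.

```python
# Hypothetical harm scores chosen by the developer. A different developer
# could choose different numbers, and the car would then decide differently.
HARM_SCORE = {
    "go_straight": 9,  # severely injures the boy
    "turn_right": 6,   # damages the car and injures the passenger
}


def choose_action(harm_score):
    """Pick the action with the lowest harm score assigned by the developer."""
    return min(harm_score, key=harm_score.get)


print(choose_action(HARM_SCORE))

# Another developer's priorities lead to the opposite decision.
print(choose_action({"go_straight": 3, "turn_right": 6}))
```

The machine only follows the numbers it was given. The moral judgement lies in who chose those numbers and why.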
If you were in the place of this developer and if there was no other alternative to the situation, which
one of the two would you prioritise and why?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
* Images shown here are the property of individual organisations and are used here for reference purpose only.
Scenario 2:
Let us now assume that the car has hit the boy who came in front of it. Considering this as an accident,
who should be held responsible for it? Why?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Here, the choices might differ from person to person, and one must understand that nobody is wrong
in this case. Every person has a different perspective and hence takes decisions according to
his or her own morals.
Data Privacy
The world of Artificial Intelligence revolves around data. Every company, whether small or big, is mining
data from as many sources as possible. More than 70% of the data collected till now has been collected
in the last 3 years, which shows how important data has become in recent times. It is rightly said
that data is the new gold. This makes us think:
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
One of the major sources of data for many major companies is the device which all of us have in our
hands all the time: Smartphones. Smartphones have nowadays become an integral part of our lives.
Most of us use smartphones more than we interact with people around us. Smartphones in today’s
era provide us with a lot of facilities and features which have made our lives easier. Feeling hungry?
Order food online. Want to shop but don’t have time to go out? Go shopping online. From booking
tickets to watching our favourite shows, everything is available in this one small box loaded with
technology.
Another feature of smartphones nowadays is that they provide us with customised recommendations
and notifications according to our choices. Let us understand this with the help of some examples:
1. Suppose you are talking to your friend on a mobile network or on an app like WhatsApp. You tell
your friend that you wish to buy new shoes and are looking for suggestions from him/her. You
discuss shoes and that is it. After some time, online shopping websites start sending
you notifications to buy shoes! They start recommending some of their products and urge you
to buy some.
2. If you search on Google for a trip to Kerala or any other destination, then soon after the search,
apps on your phone which support advertisements may start sending messages about
packages that you can buy for the trip.
3. Even when you are not using your phone and are talking to a person face-to-face about a book
you have read recently, with the phone lying locked nearby, the phone may end up
showing notifications about similar books or messages about the same book once you use
it.
In all such examples, how does the smartphone get to know about the discussions and thoughts that
you have? Remember that whenever you download and install an app, it asks you for several permissions
to access your phone’s data in different ways. If you do not grant these permissions, you
normally cannot use the app. So, in order to use it, we often allow all the permissions the app
asks for without giving it a thought. As a result, the
app has permission to access various sensors in your smartphone and gather data
about you and your surroundings. We forget that the smartphone we use is a box full of sensors
which are powered all the time while the phone is switched on.
This leads us to a crucial question: Are we okay with sharing our data with the external world?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
We need to understand that the collection of data by various applications is generally considered
ethical when the smartphone user has agreed to it (by clicking on allow when asked for permission and
by agreeing to the terms and conditions). At the same time, if you do not want to share your data with
anyone, you can opt for alternative applications which serve a similar purpose and collect less of your
data. For example, some people choose Telegram as an alternative to WhatsApp. How much data any such
app collects is described in its privacy policy, which is worth reading before you decide. But since
WhatsApp is more popular and widely used, people go for it without thinking twice.
AI Bias
Another aspect of AI Ethics is bias. Everyone has biases of their own. No matter how much we try to
be unbiased, in some way or the other we have our own biases, even towards small things. Biases
are not always negative. Sometimes a bias is needed to control a situation and keep
things working.
When we talk about a machine, we know that it is artificial and cannot think on its own. It can have
intelligence, but we cannot expect a machine to have any biases of its own. Any bias can transfer from
the developer to the machine while the algorithm is being developed. Let us look at some of the
examples:
1. Most virtual assistants have a female voice by default. It is only recently that some companies have
recognised this bias and started offering male voice options, but ever since virtual assistants
came into use, female voices have been preferred for them over any other voice. Can you think
of some reasons for this?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
2. If you search on Google for salons, the first few results are mostly for female salons. This is based
on the assumption that a person searching for a salon is in all probability female. Do
you think this is a bias? If yes, is it a negative bias or a positive one?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Many other biases are also found in AI systems. They are not thought up by the machine but
are transferred from the developer, intentionally or unintentionally.
AI Access
Since Artificial Intelligence is still a budding technology, not everyone has the opportunity to access it.
The people who can afford AI-enabled devices make the most of it, while others who cannot are left
behind. Because of this, a gap has emerged between these two classes of people, and it widens
with the rapid advancement of technology. Let us understand this with the help of some examples:
AI creates unemployment
AI is making people’s lives easier. Most things nowadays are done in just a few clicks. Before long,
AI may be able to do all the laborious tasks which we humans have been doing for a long time.
Maybe in the coming years, AI-enabled machines will replace all the people who work as labourers.
This may start an era of mass unemployment in which people with little or no skills are left without
jobs, while others who keep their skills up to date with what is required will flourish.
This brings us to a crossroads. On one hand, AI is advancing and improving the lives of people
by working for them and doing some of their tasks. On the other hand, there are the lives of people
who depend on laborious jobs and are not skilled to do anything else.
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Should AI not replace laborious jobs? Will the lives of people improve if they keep on being unskilled?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Here, we need to understand that to overcome such an issue, one needs to be open to changes. As
technology is advancing with time, humans need to make sure that they are a step ahead and
understand this technology with its pros and cons.
AI for kids
As we all can see, kids nowadays are smart enough to understand technology from a very early age.
As their thinking capabilities increase, they start becoming techno-savvy and eventually they learn
everything more easily than an adult. But should technology be given to children so young?
Consider this: A young boy in class 3 has got some Maths homework to finish. He is sitting at a table
which has a voice assistant, Amazon’s Alexa, on it, and he is struggling with his homework. Soon, he starts
asking Alexa to answer all his questions. Alexa replies with answers and the boy simply writes them
down in his notebook.
While this scenario seems funny, it raises some concerns. On one hand, it is good
that the boy knows how to use technology effectively. On the other hand, he uses it to complete his
homework without really learning anything, since he is not applying his mind to solve the Maths
problems. So, while he is smart, he might not be getting educated properly.
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Conclusion
Despite AI’s promise to bring forth new opportunities, there are associated risks that need to
be mitigated appropriately and effectively. To put this in perspective, the ecosystem and the
socio-technical environment in which AI systems are embedded need to be more trustworthy.
AI Project Cycle
In this chapter, we will revisit the concept of AI Project Cycle.
Introduction
Let us assume that you have to make a greeting card for your mother as it is her birthday. You are very
excited about it and have thought of many ideas to execute the same. Let us look at some of the steps
which you might take to accomplish this task:
1. Look for some cool greeting card ideas from different sources. You might go online and
check out some videos, or you may ask someone who has knowledge about it.
2. After finalising the design, you would make a list of things that are required to make this card.
3. You will check if you have the material with you or not. If not, you could go and get all the
items required, ready for use.
4. Once you have everything with you, you would start making the card.
5. If you make a mistake in the card somewhere which cannot be rectified, you will discard it and
start remaking it.
6. Once the greeting card is made, you would gift it to your mother.
Are these steps relatable?
__________________________________________________________________________________
__________________________________________________________________________________
Do you think your steps might differ? If so, write them down!
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
These steps show how we plan to execute the tasks around us. Consciously or subconsciously, our
mind makes a plan for every task we have to accomplish, which is why things become clearer
in our mind. Similarly, if we have to develop an AI project, the AI Project Cycle provides us with an
appropriate framework which can lead us towards the goal. The AI Project Cycle mainly has 5 stages:
Problem Scoping, Data Acquisition, Data Exploration, Modelling and Evaluation.
Starting with Problem Scoping, you set the goal for your AI project by stating the problem which you
wish to solve with it. Under problem scoping, we look at various parameters which affect the problem
we wish to solve so that the picture becomes clearer.
To proceed,
● You need to acquire data, which will become the base of your project as it will help you
understand the parameters that are related to the problem you have scoped.
● You go for data acquisition by collecting data from various reliable and authentic sources.
● Since the data you collect would be in large quantities, you can try to give it a visual form using
different types of representations like graphs, databases, flow charts, maps, etc. This makes
it easier for you to interpret the patterns which your acquired data follows.
● After exploring the patterns, you can decide upon the type of model you would build to
achieve the goal. For this, you can research online and select various models which give a
suitable output.
● You can test the selected models and figure out which is the most efficient one.
● The most efficient model is now the base of your AI project and you can develop your
algorithm around it.
● Once the modelling is complete, you now need to test your model on some newly fetched
data. The results will help you in evaluating your model and improving it.
● Finally, after evaluation, the project cycle is now complete and what you get is your AI project.
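The stages listed above can be sketched in a few lines of Python. This is only an illustrative outline under stated assumptions: the temperature readings, the two candidate models and every function name are invented for this example and are not part of the AI Project Cycle itself.

```python
# Illustrative walk through the five stages; all data and names are made up.

# 1. Problem Scoping: state the goal.
goal = "Predict tomorrow's temperature from recent days"

# 2. Data Acquisition: in a real project this comes from a reliable source.
readings = [30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0]
train, test = readings[:6], readings[6:]  # keep some data aside for testing

# 3. Data Exploration: look at a simple pattern, the day-to-day change.
changes = [b - a for a, b in zip(train, train[1:])]
average_change = sum(changes) / len(changes)

# 4. Modelling: two candidate models to compare.
def same_as_yesterday(history):
    return history[-1]

def follow_the_trend(history):
    return history[-1] + average_change

# 5. Evaluation: test each model on data it has not seen.
def average_error(model):
    history = list(train)
    errors = []
    for actual in test:
        errors.append(abs(model(history) - actual))
        history.append(actual)  # the real value becomes known afterwards
    return sum(errors) / len(errors)

candidates = {"same_as_yesterday": same_as_yesterday,
              "follow_the_trend": follow_the_trend}
scores = {name: average_error(model) for name, model in candidates.items()}
best_model = min(scores, key=scores.get)  # lowest average error wins
print(goal)
print(scores, "-> best:", best_model)
```

With these made-up readings the trend-following model has the lower error, so it would become the base of the project. Real data would rarely be this tidy.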
Let us understand each stage of the AI Project Cycle in detail.
Problem Scoping
It is a fact that we are surrounded by problems. They could be small or big, sometimes ignored or
sometimes even critical. Many times, we become so used to a problem that it becomes a part of our
life. Identifying such a problem and having a vision to solve it is what Problem Scoping is about. Many
times we are unable to observe any problem in our surroundings. In that case, we can take a look
at the Sustainable Development Goals. The United Nations has announced 17 goals, which
are termed the Sustainable Development Goals. The aim is to achieve these goals by the end of
2030. A pledge to do so has been taken by all the member nations of the UN.
Here are the 17 SDGs. Let’s take a look:
As you can see, many goals correspond to the problems which we might observe around us too. One
should look for such problems and try to solve them as this would make many lives better and help
our country achieve these goals.
Scoping a problem is not easy, as we need a deep understanding of it so that the
picture becomes clearer while we are working to solve it. Hence, we use the 4Ws Problem Canvas to
help us out.
Who?
The “Who” block helps in analysing the people who are affected directly or indirectly by the problem. Under
this, we find out who the ‘Stakeholders’ of this problem are and what we know about them.
Stakeholders are the people who face this problem and would benefit from the solution. Here is
the Who Canvas:
What?
Under the “What” block, you need to look into what you have on hand. At this stage, you need to
determine the nature of the problem. What is the problem and how do you know that it is a problem?
Under this block, you also gather evidence to prove that the problem you have selected actually exists.
Newspaper articles, media reports, announcements, etc. are some examples of such evidence. Here is the What Canvas:
Where?
Now that you know who is associated with the problem and what the problem actually is, you need
to focus on the context, situation or location of the problem. This block will help you look into the
situation in which the problem arises, the context of it, and the locations where it is prominent. Here
is the Where Canvas:
Why?
You have finally listed down all the major elements that affect the problem directly. Now it is
easy to understand who would benefit from the solution, what is to be
solved, and where the solution will be deployed. These three canvases now become the base of why
you want to solve this problem. Thus, in the “Why” canvas, think about the benefits which the
stakeholders would get from the solution and how it will benefit them as well as the society.
After filling the 4Ws Problem canvas, you now need to summarise all the cards into one template. The
Problem Statement Template helps us to summarise all the key points into one single Template so
that in future, whenever there is need to look back at the basis of the problem, we can take a look at
the Problem Statement Template and understand the key elements of it.
Our [stakeholder(s)]    Who
Data Acquisition
As we move ahead in the AI Project Cycle, we come across the second stage, which is Data
Acquisition. As the term suggests, this stage is about acquiring data for the project. Let us first
understand what data is. Data can be a piece of information or facts and statistics collected together
for reference or analysis. Whenever we want an AI project to be able to predict an output, we need
to train it first using data.
For example, if you want to make an Artificially Intelligent system which can predict the salary of an
employee based on his previous salaries, you would feed the data of his previous salaries into the
machine. This is the data with which the machine is trained. Once it is ready, it can predict
his next salary. The previous salary data here is known as Training Data, while the data on which the
machine’s predictions are checked (here, the next salary) is known as Testing Data.
For better efficiency of an AI project, the Training data needs to be relevant and authentic. In the
previous example, if the training data was not of the previous salaries but of his expenses, the machine
would not have predicted his next salary correctly since the whole training went wrong. Similarly, if
the previous salary data was not authentic, that is, it was not correct, then too the prediction could
have gone wrong. Hence:
For any AI project to be efficient, the training data should be authentic and relevant to the problem
statement scoped.
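The salary example can be tried out with a small sketch. The salary figures below are made up, the straight-line trend is just one possible model chosen for simplicity, and the helper name fit_line is hypothetical. The last salary is kept aside as testing data so that the prediction can be checked against it.

```python
# Hypothetical yearly salaries of one employee (illustrative numbers only).
salaries = [30000, 32000, 34000, 36000, 38000]

# Training Data: the earlier years. Testing Data: the last year, kept aside.
training_data = salaries[:-1]
testing_value = salaries[-1]

def fit_line(values):
    """Least-squares straight line through the points (0, v0), (1, v1), ..."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Train on the earlier salaries, then predict the next one.
slope, intercept = fit_line(training_data)
predicted = slope * len(training_data) + intercept

# Compare the prediction with the testing data.
error = abs(predicted - testing_value)
print(f"predicted {predicted:.0f}, actual {testing_value}, error {error:.0f}")
```

If the training list were replaced with unrelated numbers, such as expenses, the same code would still produce a prediction, but it would no longer be a meaningful estimate of the next salary.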
Data Features
Look at your problem statement once again and try to find the data features required to address this
issue. Data features refer to the type of data you want to collect. In our previous example, data
features would be salary amount, increment percentage, increment period, bonus, etc.
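One way to picture data features is as the named fields of a single record. The sketch below uses the features from the salary example. The class name, field names and values are illustrative assumptions, not a required format.

```python
from dataclasses import dataclass, fields

@dataclass
class SalaryRecord:
    """One row of data; each field is a data feature (names are illustrative)."""
    salary_amount: float            # salary for the period
    increment_percentage: float     # raise applied, in percent
    increment_period_months: int    # months between raises
    bonus: float                    # bonus paid in the period

# A made-up record for one employee.
record = SalaryRecord(salary_amount=30000, increment_percentage=5.0,
                      increment_period_months=12, bonus=2000)

# The list of features tells us what sort of data has to be collected.
feature_names = [f.name for f in fields(record)]
print(feature_names)
```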
After listing the data features, you get to know what sort of data is to be collected. Now, the
question arises: from where can we get this data? There can be various ways in which you can collect
data. Some of them are:
● Cameras
● Observations
● API (Application Programming Interface)
Sometimes, you use the internet and try to acquire data for your project from random websites.
Such data might not be authentic as its accuracy cannot be verified. It is therefore necessary
to find a reliable source of data from which authentic information can be taken. At the same
time, we should make sure that the data we collect is openly available and not someone’s private
property. Extracting private data can be an offence. Some of the most reliable and authentic sources of
information are the open data websites hosted by the government. These government portals
have general information collected in a suitable format which can be downloaded and used wisely.
Data Exploration
In the previous modules, you have set the goal of your project and have also found ways to acquire
data. While acquiring data, you must have noticed that the data is a complex entity – it is full of
numbers and if anyone wants to make some sense out of it, they have to work some patterns out of
it. For example, if you go to the library and pick up a random book, you first try to go through its
content quickly by turning pages and by reading the description before borrowing it for yourself,
because it helps you in understanding if the book is appropriate to your needs and interests or not.
Thus, to analyse the data, you need to visualise it in some user-friendly format so that you can:
● Quickly get a sense of the trends, relationships and patterns contained within the data.
● Define strategy for which model to use at a later stage.
● Communicate the same to others effectively.
To visualise data, we can use various types of visual representations.
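As a very small illustration of why a visual form helps, the sketch below turns a handful of made-up counts into a text bar chart. It uses only plain Python rather than a plotting library, and the data and the function name text_bar_chart are assumptions made for this example.

```python
# Made-up data: number of students in each grade.
grade_counts = {"A": 4, "B": 9, "C": 6, "D": 2}

def text_bar_chart(counts):
    """Return one line per category, with one '#' per item counted."""
    lines = []
    for label, count in counts.items():
        lines.append(f"{label} | {'#' * count} {count}")
    return lines

chart = text_bar_chart(grade_counts)
print("\n".join(chart))
```

Even in this crude form, it is easier to see at a glance which grade is most common than by reading the raw numbers.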
Modelling
In the previous module of Data Exploration, we have seen various types of graphical representations
which can be used for representing different parameters of data. A graphical representation makes
the data understandable for humans, as we can discover trends and patterns in it. But when a
machine accesses and analyses data, it needs the data in the most basic form of numbers
(which is binary: 0s and 1s), and when it comes to discovering patterns and trends in data, the machine
uses mathematical representations of the same. The ability to mathematically describe the
relationship between parameters is the heart of every AI model. Thus, whenever we talk about
developing AI models, it is the mathematical approach towards analysing data which we refer to.
AI Models can be broadly categorised as Rule Based or Learning Based. Learning Based models include
Machine Learning and Deep Learning.
Machine Learning can be of three types: Supervised Learning, Unsupervised Learning and
Reinforcement Learning.
Supervised Learning
In a supervised learning model, the dataset which is fed to the machine is labelled. In
other words, the dataset is known to the person who is training the machine; only then is he/she
able to label the data. A label is some information which can be used as a tag for data. For example,
students get grades according to the marks they secure in examinations. These grades
are labels which categorise the students according to their marks.
There are two types of Supervised Learning models: Classification, where the output is a category,
and Regression, where the output is a continuous value.
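The grades example can be turned into a tiny supervised learning sketch. The marks and grades below are made up, and the "closest labelled example" rule is only one simple way of learning from labels, chosen here because it is easy to follow.

```python
# Labelled training data: (marks, grade). The grade is the label.
labelled_data = [(95, "A"), (88, "A"), (76, "B"), (71, "B"), (58, "C"), (52, "C")]

def predict_grade(marks):
    """Give new marks the label of the closest labelled example."""
    nearest_marks, nearest_grade = min(
        labelled_data, key=lambda pair: abs(pair[0] - marks))
    return nearest_grade

print(predict_grade(90))
print(predict_grade(60))
```

The machine was never told the rule that converts marks into grades. It only saw labelled examples and uses them to tag new marks.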
Unsupervised Learning
An unsupervised learning model works on unlabelled dataset. This means that the data which is fed
to the machine is random and there is a possibility that the person who is training the model does not
have any information regarding it. The unsupervised learning models are used to identify
relationships, patterns and trends out of the data which is fed into it. It helps the user in understanding
what the data is about and what are the major features identified by the machine in it.
For example, suppose you have a random set of 1000 dog images and you wish to find some pattern
in it. You would feed this data into the unsupervised learning model and train the machine
on it. After training, the machine would come up with the patterns which it was able to identify.
The machine might come up with patterns which are already known to the user, like colour, or it might
even come up with something very unusual, like the size of the dogs.
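To see how a machine can find groups without any labels, here is a minimal sketch that splits made-up dog weights into two groups. It follows the idea of the k-means method with two groups. The data, the starting guess and the function name two_groups are assumptions for this example, and the code expects at least two different values.

```python
# Unlabelled data: made-up dog weights in kg; nobody has tagged them.
weights = [4.0, 5.0, 6.0, 30.0, 32.0, 34.0]

def two_groups(values, rounds=10):
    """Split numbers into two groups around two moving centres."""
    centres = [min(values), max(values)]          # simple starting guess
    for _ in range(rounds):
        groups = [[], []]
        for v in values:
            # put each value with whichever centre is closer
            nearest = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            groups[nearest].append(v)
        # move each centre to the average of its group
        centres = [sum(g) / len(g) for g in groups]
    return groups, centres

groups, centres = two_groups(weights)
print(groups, centres)
```

Nobody told the program which dogs are small and which are large. The two groups emerge from the data itself.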
Unsupervised learning models can be further divided into two categories: Clustering and
Dimensionality Reduction.
Dimensionality Reduction: We humans are able to visualise up to 3 dimensions only, but according to
many theories and algorithms, there are various entities which exist beyond 3 dimensions. For
example, in Natural Language Processing, words are considered to be N-dimensional entities,
which means that we cannot visualise them as they exist beyond our visualisation ability. Hence, to
make sense of them, we need to reduce their dimensions. Here, a dimensionality reduction algorithm
is used.
As we reduce the dimensions of an entity, the information which it contains starts getting distorted.
For example, if we have a ball in our hand, it is 3-dimensional right now. But if we click its picture, the
data transforms to 2-D, as an image is a 2-dimensional entity. As soon as we reduce one
dimension, a large part of the information is lost, as now we will not know about the back of the ball.
Was the ball of the same colour at the back or not? Or was it just a hemisphere? If we reduce the
dimensions further, more and more information will get lost.
Hence, to reduce the dimensions and still be able to make sense out of the data, we use Dimensionality
Reduction.
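The loss of information in the ball example can be shown with the simplest possible reduction: dropping one coordinate. This is only a sketch with made-up points. Practical dimensionality reduction algorithms choose more carefully what to keep, which is not shown here.

```python
# Two made-up points in space with the same (x, y) position,
# one towards the camera (z = 1) and one away from it (z = -1).
front_point = (0.5, 0.2, 1.0)
back_point = (0.5, 0.2, -1.0)

def take_picture(point_3d):
    """Reduce 3-D to 2-D by dropping the depth coordinate z."""
    x, y, z = point_3d
    return (x, y)

front_2d = take_picture(front_point)
back_2d = take_picture(back_point)

# The two points were different in 3-D but look identical in 2-D.
print(front_2d, back_2d, front_2d == back_2d)
```

Once the depth is gone, the picture alone cannot tell us which of the two points we were looking at.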
Evaluation
Once a model has been made and trained, it needs to go through proper testing so that one can
calculate the efficiency and performance of the model. Hence, the model is tested with the help of
Testing Data (which was separated out of the acquired dataset at Data Acquisition stage) and the
efficiency of the model is calculated on the basis of the parameters mentioned below:
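As one illustration of such a calculation, the sketch below computes accuracy, that is, the fraction of testing examples a model got right. Accuracy is used here as an assumed example of an evaluation measure, and the labels and predictions are made up.

```python
# Made-up testing data: the correct labels and what a model predicted.
actual_labels = ["dog", "cat", "dog", "dog", "cat"]
predicted_labels = ["dog", "cat", "cat", "dog", "cat"]

def accuracy(actual, predicted):
    """Fraction of testing examples the model got right."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)

score = accuracy(actual_labels, predicted_labels)
print(f"accuracy = {score:.0%}")
```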
Neural Networks
Neural networks are loosely modelled after how neurons in the human brain behave. The key
advantage of neural networks is that they are able to extract data features automatically without
needing the input of the programmer. A neural network is essentially a system of organizing machine
learning algorithms to perform certain tasks. It is a fast and efficient way to solve problems for which
the dataset is very large, such as in images.
As seen in the figure given, the larger Neural Networks tend to perform better with larger amounts of
data whereas the traditional machine learning algorithms stop improving after a certain saturation
point.
This is a representation of how neural networks work. A Neural Network is divided into multiple layers
and each layer is further divided into several blocks called nodes. Each node has its own task to
accomplish which is then passed to the next layer. The first layer of a Neural Network is known as the
input layer. The job of an input layer is to acquire data and feed it to the Neural Network. No
processing occurs at the input layer. Next to it, are the hidden layers. Hidden layers are the layers in
which the whole processing occurs. Their name essentially means that these layers are hidden and are
not visible to the user.
Each node of these hidden layers carries out its own computation (typically a weighted sum of its
inputs passed through an activation function) on the data it receives from the previous layer.
The processed output is then fed to the subsequent hidden layer
of the network. There can be multiple hidden layers in a neural network system and their number
depends upon the complexity of the function for which the network has been configured. Also, the
number of nodes in each layer can vary accordingly. The last hidden layer passes the final processed
data to the output layer which then gives it to the user as the final output. Similar to the input layer,
the output layer too does not process the data which it acquires. It is meant to act as the user interface.
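The flow of data through the layers described above can be sketched as follows. This is a toy illustration only: the input values, weights and biases are invented (a real network learns its weights during training), and the sigmoid function is just one common choice of activation.

```python
import math

def sigmoid(x):
    """Squash any number into the range 0 to 1."""
    return 1 / (1 + math.exp(-x))

def layer(inputs, weights, biases):
    """Each node: weighted sum of all inputs plus a bias, then sigmoid."""
    outputs = []
    for node_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(sigmoid(total))
    return outputs

# Input layer: just holds the data (two made-up input values).
input_layer = [1.0, 0.0]

# Two hidden layers with invented weights: two nodes, then one node.
hidden_1 = layer(input_layer, weights=[[0.5, -0.5], [1.0, 1.0]], biases=[0.0, -1.0])
hidden_2 = layer(hidden_1, weights=[[1.0, -1.0]], biases=[0.0])

# Output layer: hands the final processed value to the user.
output = hidden_2[0]
print(round(output, 3))
```

Adding more entries to the weights lists adds more nodes to a layer, and calling layer again adds another hidden layer.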
3. Let us now create a virtual environment named env. To create the environment, write
conda create -n env python=3.7
This code will create an environment named env and will install Python 3.7 and other basic packages
into it.
4. After some processing, the prompt will ask if we wish to proceed with installations or not. Type Y
on it and press Enter. Once we press Enter, the packages will start getting installed in the
environment.
5. Depending upon the internet speed, the downloading of packages might take varied time. The
processing screen will look like this:
6. Once all the packages are downloaded and installed, we will get a message like this:
7. This shows that our environment called env has been successfully created. Once an environment
has been successfully created, we can access it by writing the following:
conda activate env
This would activate the virtual environment and we can see the term written in brackets has changed
from (base) to (env). Now our virtual environment is ready to be used.
But, to open and work with Jupyter Notebooks in this environment, we need to install the packages
which help in working with Jupyter Notebook. These packages get installed by default in the base
environment when Anaconda gets installed.
To install Jupyter Notebook dependencies, we need to activate our virtual environment env and write:
conda install ipykernel nb_conda jupyter
It will again ask if we wish to proceed with the installations, type Y to begin the installations. Once the
installations are complete, we can start working with Jupyter notebooks in this environment.
Why Python?
Artificial intelligence is the trending technology of the future. We can see so many
applications around us. If we as individuals would also like to develop an AI
application, we will need to know a programming language. There are various
programming languages like Lisp, Prolog, C++, Java and Python, which can be
used for developing applications of AI. Out of these, Python has gained the most
popularity for the following reasons:
Python has few keywords, simple structure and a clearly defined syntax. Python allows anyone to learn
the language quickly. A program written in Python is fairly easy-to-maintain.
Python has a huge bunch of libraries with plenty of built-in functions to solve a variety of problems.
Interactive Mode
Python has support for an interactive mode which allows interactive testing and debugging of snippets
of code.
Python can run on a wide variety of operating systems and hardware platforms, and has the same
interface on all platforms.
Extendable
We can add low-level modules to the Python interpreter. These modules enable programmers to add
to or customize their tools to be more efficient.
Databases and Scalable
Python provides interfaces to all major open source and commercial databases along with a better
structure and support for much larger programs than shell scripting.
Applications of Python
There exist a wide variety of applications when it comes to Python. Some of the applications are:
1. Printing Statements
We can use Python to display outputs for any code we write. To print any statement, we use print()
function in Python.
Instructions written in the source code to execute are known as statements. These are the lines of
code which we write for the computer to work upon. For example, if we wish to print the addition of
two numbers, say 5 and 10, we would simply write:
print(5+10)
This is a Python statement as the computer would go through it and do the needful (which in this
case would be to calculate 5+10 and print it on the output screen).
On the other hand, there exist some statements which do not get executed by the computer. These
lines of code are skipped by the machine. They are known as comments. Comments are the
statements which are incorporated in the code to give a better understanding of code statements to
the user. To write a comment in Python, one can use # and then write anything after it. For example:
# This is a comment and will not be read by the machine.
print(5+10) # This is a statement and the machine will print the summation.
Here, we can see that the first line is a comment as it starts with #. In the second line, we have an
executable statement followed by a comment which is written to explain the code. In this way, we can
add comments into our code so that anyone can understand the gist of it.
In Python, there exist some words which are pre-defined and carry a specific meaning for the machine
by default. These words are known as keywords. Keywords cannot be changed at any point in time
and should not be used in any way other than their default one; using a keyword as a name makes
Python report a syntax error. Some of the keywords are mentioned below:
An identifier is any name chosen by the user, such as the name of a variable. Identifiers can be
declared by the user as per their convenience of use and can vary according to the way the user
wants. These words are not pre-defined. Keywords cannot be used as identifiers. Some examples of
identifiers are: count, interest, x, ai_learning, Test, etc. Identifiers are also case-sensitive; hence,
an identifier named Test is different from an identifier named test.
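The difference between keywords and identifiers can be checked from within Python itself. The following is a minimal sketch that uses the standard keyword module; the helper name is_usable_name and the list of candidate names are made up for illustration.

```python
import keyword

# A few of Python's reserved words (the full list is in keyword.kwlist).
print(keyword.kwlist[:5])

def is_usable_name(name):
    """True if `name` can be used as an identifier (hypothetical helper)."""
    return name.isidentifier() and not keyword.iskeyword(name)

# Names from the text, plus a keyword and a name that starts with a digit.
candidates = ["count", "ai_learning", "Test", "test", "for", "2nd_value"]
for name in candidates:
    verdict = "valid identifier" if is_usable_name(name) else "not allowed"
    print(name, "->", verdict)

# Identifiers are case-sensitive: Test and test are two separate variables.
Test = 1
test = 2
print(Test != test)   # True
```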
A variable is a named location used to store data in the memory. It is helpful to think of variables as a
container that holds data which can be changed later throughout programming. Just like in
Mathematics, in Python too we can use variables to store values in it. The difference here is, that in
Python, the variables not only store numerical values, but can also contain different types of data.
For example:
X = 10 # X variable contains numerical data
Letters = 'XYZ' # Letters variable contains alphabetic data
number = 13.95 # number variable contains a decimal value
word = 'k' # word variable contains a character
All of these variables contain different types of data in them. The type of data is defined by the term
datatype in Python. There can be various types of data which are used in Python programming. Hence,
the machine identifies the type of variable according to the value which is stored inside it. Various
datatypes in Python can be:
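To see which datatype the machine has assigned to a variable, the built-in type() function can be used. Here is a minimal sketch using the same four variables as in the example above.

```python
X = 10
Letters = 'XYZ'
number = 13.95
word = 'k'

# type() reports the datatype that Python inferred from the stored value.
for value in (X, Letters, number, word):
    print(repr(value), "is of type", type(value).__name__)

# Python has no separate character datatype:
# a single character is simply a string (str) of length 1.
print(len(word))   # 1

# The datatype follows the value, so it changes when a new value is stored.
X = "ten"
print(type(X).__name__)   # str
```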
5. Python inputs
In Python, not only can we display the output to the user, but we can also collect data from the user
and pass it on to the Python script for further processing. To collect data from the user at the
time of execution, the input() function is used. The data taken as input from the user is considered to
be a string (sequence of characters) by default. Hence, while using the input function, the datatype of
the expected input needs to be mentioned so that the machine does not interpret the received data in
an incorrect manner.
For example:
Str = input("Enter some text: ") # Python keeps the input as a string datatype
Number = int(input("Enter a whole number: ")) # Input string gets converted to an integer value before assignment
Value = float(input("Enter a decimal number: ")) # Input string gets converted to a decimal value before assignment
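The sketch below shows why the conversion matters. The function read_whole_number and its reader parameter are hypothetical; reader defaults to input, and exists only so that the function can also be tried without typing at the keyboard.

```python
def read_whole_number(prompt, reader=input):
    """Ask for a value and convert the typed text to an integer."""
    raw = reader(prompt)   # input() always hands back a string
    return int(raw)        # convert before doing any arithmetic

# Without conversion, + joins the two strings instead of adding numbers.
print("5" + "10")             # prints 510
print(int("5") + int("10"))   # prints 15
```

In a notebook, it could be used as age = read_whole_number("Enter your age: "). If the user types something that is not a whole number, int() raises a ValueError.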
6. Python Operators
Operators are special symbols which represent computation. They are applied on operand(s), which
can be values or variables. The same operator can behave differently on different data types.
Operators when applied on operands form an expression. Operators are categorized as Arithmetic,
Relational, Logical and Assignment.
a. Arithmetic Operators
Operator   Meaning            Expression   Result
+          Addition           10 + 20      30
-          Subtraction        30 - 10      20
/          Division           30 / 10      3.0
//         Integer Division   25 // 10     2
%          Remainder          25 % 10      5
**         Raised to power    3 ** 2       9
b. Conditional Operators
c. Logical Operators
d. Assignment Operators
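The four categories of operators can be tried out as follows. This is a minimal sketch; the values 25 and 10 are picked only for illustration.

```python
a, b = 25, 10

# Arithmetic operators
print(a + b, a - b, a * b)    # 35 15 250
print(a / b)                  # 2.5 (division always gives a decimal value)
print(a // b, a % b)          # 2 5 (quotient and remainder)
print(3 ** 2)                 # 9

# Relational (comparison) operators give True or False
print(a > b, a == b, a != b)  # True False True

# Logical operators combine conditions
print(a > 20 and b > 20)      # False
print(a > 20 or b > 20)       # True
print(not a > 20)             # False

# Assignment operators
total = a      # plain assignment
total += b     # same as total = total + b
print(total)   # 35

# The same operator can behave differently on different datatypes
print("AI" + "417", "ab" * 3)   # AI417 ababab
```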
While coding in Python, a lot of times we need to take decisions. For example, if a person needs to
create a calculator with the help of a Python code, he/she needs to take in 2 numbers from the user
and then ask the user about which function he/she wishes to operate. Now, according to the user’s
choice, the selection of function would change. In this case, we need the machine to understand what
should happen when. This is where conditional statements help. Conditional statements help the
machine in taking a decision according to the condition which gets fulfilled. There exist different types
of conditional statements in Python. Some of them are:
According to the number of conditions and their dependency on each other, the relevant type of
conditional statement is used.
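The calculator described above can be sketched with an if, elif and else chain. The function name calculate and the symbols used for the user's choice are assumptions made for this example.

```python
def calculate(first, second, choice):
    """Return the result of the operation picked by the user."""
    if choice == "+":
        return first + second
    elif choice == "-":
        return first - second
    elif choice == "*":
        return first * second
    elif choice == "/":
        # A nested condition: division by zero is not possible.
        if second == 0:
            return "Cannot divide by zero"
        return first / second
    else:
        # None of the conditions above was fulfilled.
        return "Unknown operation"

print(calculate(5, 10, "+"))    # 15
print(calculate(30, 10, "/"))   # 3.0
print(calculate(4, 0, "/"))     # Cannot divide by zero
```

Only the first branch whose condition is fulfilled gets executed; the remaining branches are skipped.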
8. Looping
A lot of times, it happens that a task needs to be executed multiple number of times. For example, we
need to print hello 10 times on the output screen. One way of doing this is writing 10 print statements.
But this is time and space consuming. The other way, which is more efficient, is to use loop statements.
The loop statements help in iterating statements or a group of statements as many times as it is asked
for. In this case, we will simply write a loop which would start counting from 1 to 10. At every count,
it will print hello once on the screen and as soon as it reaches 10, the loop will stop executing. All this
can be done by just one loop statement.
Various types of looping mechanisms are available in Python. Some of them are:
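For Loop
While Loop
Python does not have a built-in Do-While loop; the same behaviour can be written as a while loop that
exits with a break statement.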
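As a minimal sketch, the hello example can be written with both a for loop and a while loop. The last loop shows one common way to imitate a do-while loop, which Python does not provide as a separate statement; the variable names are chosen only for illustration.

```python
# for loop: count from 1 to 10 and print hello at every count
for count in range(1, 11):
    print(count, "hello")

# while loop: keep repeating as long as the condition stays true
count = 1
while count <= 10:
    print(count, "hello")
    count += 1

# do-while style: the body runs at least once,
# and the condition is checked at the end of every round
attempts = 0
while True:
    attempts += 1
    if attempts >= 3:
        break
print(attempts)   # 3
```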
These were some of the basic concepts for writing a code in Python. We can explore these concepts
further by going through the experiential Jupyter notebook for this chapter. In that notebook, we will
get to explore Python basic concepts and we can also work around them to develop better
understanding around it.
Python Packages
A package is nothing but a space where we can find codes or functions or modules of similar type.
There are various packages readily available to use for free (perks of Python being an open-sourced
language) for various purposes.
To use any package in Python, we need to install it. Installing Python packages is easy. Steps for
package installation are:
1. Open Anaconda Navigator and activate your working environment.
2. Let us assume we wish to install the numpy package. To install this package, simply write:
conda install numpy
3. It will ask us to type Y if we wish to proceed with the installations. As soon as we type Y, the
installations will start and our package will be installed in our selected environment.
4. We can also install multiple packages all at once by mentioning all of them in one line. For
example, if we wish to install the numpy, pandas and matplotlib packages in our working
environment, we simply write:
conda install numpy pandas matplotlib
This code will install these three packages altogether in our environment.
Now, once the packages are installed, we can start using them by importing them in the file where
they are required. As soon as we open our Jupyter Notebook, include the package in the notebook by
writing the import command. Importing a package can be done in various ways:
import numpy
Meaning: Import numpy in the file to use its functionalities in the file to which it has been imported.
import numpy as np
Meaning: Import numpy and refer to it as np wherever it is used.
To develop a better understanding around these packages, let us go through the Jupyter Notebook
of package exploration and see how these packages can be used in Python.
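The import styles can be compared side by side. This sketch uses the math module, which comes with Python, so that it runs even before any package is installed; the same patterns apply to numpy (for example, import numpy as np). The third style, from ... import ..., is an addition to the two shown above.

```python
import math              # use the full name: math.sqrt(...)
import math as m         # use a short alias: m.sqrt(...)
from math import sqrt    # import one function and use it directly

print(math.sqrt(16))   # 4.0
print(m.pi)            # the constant pi, reached through the alias
print(sqrt(25))        # 5.0
```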
Data Sciences
Introduction
As we have discussed earlier in class 9, Artificial Intelligence is a technology which completely depends
on data. It is the data which is fed into the machine which makes it intelligent. And depending upon
the type of data we have, AI can be classified into three broad domains:
• Data Sciences (Data): working around numeric and alpha-numeric data.
• Computer Vision (CV): working around image and visual data.
Each domain has its own type of data which gets fed into the machine and hence has its own way of
working around it. Talking about Data Sciences, it is a concept to unify statistics, data analysis, machine
learning and their related methods in order to understand and analyse actual phenomena with data.
It employs techniques and theories drawn from many fields within the context of Mathematics,
Statistics, Computer Science, and Information Science.
Now before we get into the concepts of Data Sciences, let us experience this domain with the help of
the following game:
Go to this link and try to play the game of Rock, Paper Scissors against an AI model. The challenge here
is to win 20 games against AI before AI wins them against you.
__________________________________________________________________________________
__________________________________________________________________________________
What was the strategy that you applied to win this game against the AI machine?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Was it different playing Rock, Paper & Scissors with an AI machine as compared to a human?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
What approach was the machine following while playing against you?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Internet Search*: When we talk about search engines, we think
‘Google’. Right? But there are many other search engines like
Yahoo, Bing, Ask, AOL, and so on. All these search engines
(including Google) make use of data science algorithms to deliver
the best result for our searched query in a fraction of a second.
Considering the fact that Google processes more than 20 petabytes
of data every day, had there been no data science, Google wouldn’t
have been the ‘Google’ we know today.
• Predict flight delay
• Decide which class of airplanes to buy
• Whether to directly land at the destination or take a halt in between (For example, A flight
can have a direct route from New Delhi to New York. Alternatively, it can also choose to halt
in any country.)
• Effectively drive customer loyalty programs
Getting Started
Data Sciences is a combination of Python and Mathematical concepts like Statistics, Data Analysis,
probability, etc. Concepts of Data Science can be used in developing applications around AI as it gives
a strong base for data analysis in Python.
Humans are social animals. We tend to organise and/or participate in various kinds of social gatherings
all the time. We love eating out with friends and family because of which we can find restaurants
almost everywhere and out of these, many of the restaurants arrange for buffets to offer a variety of
food items to their customers. Be it small shops or big outlets, every restaurant prepares food in bulk
as they expect a good crowd to come and enjoy their food. But in most cases, after the day ends, a lot
of food is left which becomes unusable for the restaurant as they do not wish to serve stale food to
their customers the next day. So, every day, they prepare food in large quantities keeping in mind the
probable number of customers walking into their outlet. But if the expectations are not met, a good
amount of food gets wasted which eventually becomes a loss for the restaurant as they either have
to dump it or give it to hungry people for free. And if this daily loss is taken into account for a year, it
becomes quite a big amount.
Problem Scoping
Now that we have understood the scenario well, let us take a deeper look into the problem to find out
more about various factors around it. Let us fill up the 4Ws problem canvas to find out.
What do we know about them?
o Restaurants cook food in bulk every day for their buffets to meet their customer needs.
o They estimate the number of customers that would walk into their restaurant every day.
What Canvas – What is the nature of their problem?
How do you know it is a problem?
o Restaurant surveys have shown that restaurants face this problem of food waste.
What would be of key value to the stakeholders?
o If the restaurant has a proper estimate of the quantity of food to be prepared every day, the food
waste can be reduced.
Now that we have noted down all the factors around our problem, let us fill up the problem statement
template.
The Problem statement template leads us towards the goal of our project which can now be stated
as:
• Total number of customers
• Quantity of dish prepared per day
• Dish consumption
• Unconsumed dish quantity per day
• Price of dish
• Quantity of dish for the next day
Now let us understand how these factors are related to our problem statement. For this, we can use
the System Maps tool to figure out the relationship of elements with the project’s goal. Here is the
System map for our problem statement.
In this system map, you can see how the relationship of each element is defined with the goal of our
project. Recall that the positive arrows determine a direct relationship of elements while the negative
ones show an inverse relationship of elements.
After looking at the factors affecting our problem statement, now it’s time to take a look at the data
which is to be acquired for the goal. For this problem, a dataset covering all the elements mentioned
above is made for each dish prepared by the restaurant over a period of 30 days. This data is collected
offline in the form of a regular survey since this is a personalised dataset created just for one
restaurant’s needs.
Specifically, the data collected comes under the following categories: Name of the dish, Price of the
dish, Quantity of dish produced per day, Quantity of dish left unconsumed per day, Total number of
customers per day, Fixed customers per day, etc.
Data Exploration
After creating the database, we now need to look at the data collected and understand what is
required out of it. In this case, since the goal of our project is to be able to predict the quantity of food
to be prepared for the next day, we need to have the following data:
• Name of dish
• Quantity of that dish prepared per day
• Quantity of unconsumed portion of the dish per day
Thus, we extract the required information from the curated dataset and clean it up in such a way that
there exist no errors or missing elements in it.
Modelling
Once the dataset is ready, we train our model on it. In this case, a regression model is chosen in which
the dataset is fed as a dataframe and is trained accordingly. Regression is a Supervised Learning model
which takes in continuous values of data over a period of time. Since the data which we have is
continuous data of 30 days, we can use the regression model so that it predicts the next values in a
similar manner. The dataset of 30 days is divided in a ratio of 2:1 for training and testing
respectively: the model is first trained on the 20-day data and then gets evaluated on the remaining
10 days.
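The 2:1 split and the training step can be sketched in plain Python. Everything below is an assumption made for illustration: the 30-day numbers are made up, and a simple straight-line (least-squares) regression on the number of customers stands in for whichever regression model is actually chosen.

```python
# Made-up 30-day record for ONE dish (not real data):
# customers per day, and plates of the dish actually eaten that day.
customers = [40 + (day * 7) % 25 for day in range(30)]
consumed = [0.5 * c + 3 + (day % 3 - 1) for day, c in enumerate(customers)]

# 2:1 split: the first 20 days for training, the last 10 days for testing
train_x, test_x = customers[:20], customers[20:]
train_y, test_y = consumed[:20], consumed[20:]

def fit_line(xs, ys):
    """Least-squares straight line: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Train on the 20-day data, then predict for the 10 unseen days.
slope, intercept = fit_line(train_x, train_y)
predictions = [slope * x + intercept for x in test_x]

# Average gap between predicted and actual consumption on the test days
mean_abs_error = sum(abs(p - y) for p, y in zip(predictions, test_y)) / len(test_y)
print(round(slope, 2), round(intercept, 2), round(mean_abs_error, 2))
```

The testing days are never shown to the model during training, which is what makes the later evaluation meaningful.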
Evaluation
Once the model has been trained on the training dataset of 20 days, it is now time to see if the model
is working properly or not. Let us see how the model works and how it is tested.
Step 1: The trained model is fed data regarding the name of the dish and the quantity produced for the
same.
Step 2: It is then fed data regarding the quantity of food left unconsumed for the same dish on previous
occasions.
Step 3: The model then works upon the entries according to the training it got at the modelling stage.
Step 4: The Model predicts the quantity of food to be prepared for the next day.
Step 5: The prediction is compared to the testing dataset value. From the testing dataset, ideally, we
can say that the quantity of food to be produced for next day’s consumption should be the total
quantity minus the unconsumed quantity.
Step 6: The model is tested for 10 testing datasets kept aside while training.
Step 7: If the predicted value is the same as or close to the actual value, the model is said to be
accurate. Otherwise, either the model selection is changed or the model is trained on more data for
better accuracy.
Once the model is able to achieve optimum efficiency, it is ready to be deployed in the restaurant for
real-time usage.
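The comparison described in the steps above can be sketched as follows. The records, the predictions and the tolerance of 2 plates are all made-up values; how close a prediction must be to count as accurate is a choice left to the project.

```python
def expected_quantity(prepared, unconsumed):
    """Ideal quantity for the next day: total prepared minus what was left over."""
    return prepared - unconsumed

def evaluate(predictions, test_records, tolerance=2):
    """Fraction of test days on which the prediction is close enough."""
    hits = 0
    for predicted, (prepared, unconsumed) in zip(predictions, test_records):
        gap = abs(predicted - expected_quantity(prepared, unconsumed))
        if gap <= tolerance:
            hits += 1
    return hits / len(test_records)

# Made-up testing data for 10 days: (quantity prepared, quantity unconsumed)
test_records = [(50, 8), (50, 12), (45, 5), (45, 9), (40, 2),
                (40, 6), (48, 10), (48, 7), (52, 14), (52, 11)]
# Made-up model predictions for the same 10 days
predictions = [41, 39, 40, 35, 38, 36, 37, 41, 30, 42]

score = evaluate(predictions, test_records)
print(score)   # 0.9, since 9 of the 10 predictions are within 2 plates
```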
Data Collection
Data collection is nothing new in our lives. It has been a part of our society for ages.
Even when people did not have fair knowledge of calculations, records were still maintained in some
way or the other to keep an account of relevant things. Data collection is an exercise which does not
require even a tiny bit of technological knowledge. But when it comes to analysing the data, it
becomes a tedious process for humans as it is all about numbers and alpha-numerical data. That is
where Data Science comes into the picture. It not only gives us a clearer idea around the dataset, but
also adds value to it by providing deeper and clearer analyses around it. And as AI gets incorporated
in the process, predictions and suggestions by the machine become possible on the same.
Now that we have gone through an example of a Data Science based project, we have a bit of clarity
regarding the type of data that can be used to develop a Data Science related project. For the data
domain-based projects, majorly the type of data used is in numerical or alpha-numerical format and
such datasets are curated in the form of tables. Such databases are very commonly found in any
institution for record maintenance and other purposes. Some examples of datasets which you must
already be aware of are:
Now look around you and find out what are the different types of databases which are maintained in
the places mentioned below. Try surveying people who are responsible for the designated places to
get a better idea.
Sources of Data
There exist various sources of data from where we can collect any type of data required and the data
collection process can be categorised in two ways: Offline and Online.
While accessing data from any of the data sources, following points should be kept in mind:
1. Data which is available for public usage only should be taken up.
2. Personal datasets should only be used with the consent of the owner.
3. One should never breach someone’s privacy to collect data.
4. Data should only be taken from reliable sources as the data collected from random sources
can be wrong or unusable.
5. Reliable sources of data ensure the authenticity of data which helps in proper training of the
AI model.
Types of Data
For Data Science, usually the data is collected in the form of tables. These tabular datasets can be
stored in different formats. Some of the commonly used formats are:
1. CSV: CSV stands for comma separated values. It is a simple file format used to store tabular
data. Each line of this file is a data record and each record consists of one or more fields which
are separated by commas. Since the values of records are separated by commas, these files
are known as CSV files.
2. Spreadsheet: A Spreadsheet is a piece of paper or a computer program which is used for
accounting and recording data using rows and columns into which information can be
entered. Microsoft Excel is a program which helps in creating spreadsheets.
3. SQL: SQL stands for Structured Query Language. It is a domain-specific language used in
programming and is designed for managing data held in different kinds of DBMS (Database
Management System). It is particularly useful in handling structured data.
A lot of other formats of databases also exist; you can explore them online!
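To get a feel for the CSV and SQL formats, here is a minimal sketch using only modules that come with Python (csv, io and sqlite3). The records, the table name dishes and the column names are made up for illustration; the CSV text is kept in memory instead of in a file.

```python
import csv
import io
import sqlite3

# A tiny CSV "file" held in memory (made-up records).
csv_text = """dish,price,prepared,unconsumed
Paneer Tikka,180,50,8
Dal Makhani,150,40,5
Veg Biryani,200,45,12
"""

# CSV: each line is a record and commas separate the fields.
rows = list(csv.DictReader(io.StringIO(csv_text)))
print(rows[0]["dish"], rows[0]["price"])   # Paneer Tikka 180

# SQL: load the same records into an in-memory SQLite database and query it.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE dishes (dish TEXT, price INTEGER, prepared INTEGER, unconsumed INTEGER)"
)
connection.executemany(
    "INSERT INTO dishes VALUES (?, ?, ?, ?)",
    [(r["dish"], int(r["price"]), int(r["prepared"]), int(r["unconsumed"])) for r in rows],
)
wasteful = connection.execute(
    "SELECT dish FROM dishes WHERE unconsumed > 6 ORDER BY unconsumed DESC"
).fetchall()
print(wasteful)   # [('Veg Biryani',), ('Paneer Tikka',)]
connection.close()
```

Note that the csv module reads every field as a string, which is why the numbers are converted with int() before being stored in the database.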
Data Access
After collecting the data, to be able to use it for programming purposes, we should know how to access
the same in a Python code. To make our lives easier, there exist various Python packages which help
us in accessing structured data (in tabular form) inside the code. Let us take a look at some of these
packages:
NumPy
NumPy, which stands for Numerical Python, is the fundamental package for mathematical and logical
operations on arrays in Python. It is a commonly used package when it comes to working around
numbers. NumPy gives a wide range of arithmetic operations around numbers giving us an easier
approach in working with them. NumPy also works with arrays, which are nothing but homogeneous
collections of data.
An array is nothing but a set of multiple values which are of the same datatype. They can be numbers,
characters, booleans, etc. but an array can hold only one datatype at a time. In NumPy, the
arrays used are known as ND-arrays (N-Dimensional Arrays) as NumPy comes with a feature of
creating n-dimensional arrays in Python.
An array can easily be compared to a list. Let us take a look at how they are different:
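A few of these differences can be seen directly in code. This is a minimal sketch, assuming the numpy package has been installed as described earlier; the values are made up.

```python
import numpy as np

marks_list = [10, 20, 30]
marks_array = np.array([10, 20, 30])

# A list can mix datatypes; an array keeps one datatype for all its values.
mixed_list = [10, "ten", 10.0]
print(marks_array.dtype)

# Arithmetic acts on every element of an array at once ...
print(marks_array * 2)             # [20 40 60]
# ... while * on a list repeats the whole list.
print(marks_list * 2)              # [10, 20, 30, 10, 20, 30]

# + adds arrays element by element but joins lists end to end.
print(marks_array + marks_array)   # [20 40 60]
print(marks_list + marks_list)     # [10, 20, 30, 10, 20, 30]

# Arrays can have more than one dimension (ND-arrays).
table = np.array([[1, 2, 3], [4, 5, 6]])
print(table.ndim, table.shape)     # 2 (2, 3)
```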
Pandas
Pandas is a software library written for the Python programming language for data manipulation and
analysis. In particular, it offers data structures and operations for manipulating numerical tables and
time series. The name is derived from the term "panel data", an econometrics term for data sets that
include observations over multiple time periods for the same individuals.
Here are just a few of the things that pandas does well:
• Easy handling of missing data (represented as NaN) in floating point as well as non-floating
point data
• Size mutability: columns can be inserted and deleted from DataFrame and higher dimensional
objects
• Automatic and explicit data alignment: objects can be explicitly aligned to a set of labels, or
the user can simply ignore the labels and let Series, DataFrame, etc. automatically align the
data for you in computations
• Intelligent label-based slicing, fancy indexing, and subsetting of large data sets
• Intuitive merging and joining data sets
• Flexible reshaping and pivoting of data sets
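Some of the points above can be tried on a small table. This is a minimal sketch, assuming pandas and numpy are installed; the DataFrame contents and column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up records; NaN marks a value that was not recorded.
data = pd.DataFrame({
    "dish": ["Paneer Tikka", "Dal Makhani", "Veg Biryani"],
    "prepared": [50, 40, 45],
    "unconsumed": [8, np.nan, 12],
})

# Missing data: count the NaN values per column, then drop incomplete rows.
print(data.isna().sum())
cleaned = data.dropna()

# Size mutability: insert a new column and delete an existing one.
cleaned = cleaned.assign(consumed=cleaned["prepared"] - cleaned["unconsumed"])
cleaned = cleaned.drop(columns=["prepared"])

# Label-based slicing: pick rows by a condition and columns by their names.
print(cleaned.loc[cleaned["consumed"] > 35, ["dish", "consumed"]])
```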
Matplotlib*
Matplotlib is an amazing visualization library in Python for 2D plots of arrays. Matplotlib is a multi-
platform data visualization library built on NumPy arrays. One of the greatest benefits of visualization
is that it allows us visual access to huge amounts of data in easily digestible visuals. Matplotlib comes
with a wide variety of plots. Plots help us to understand trends and patterns, and to make correlations.
They’re typically instruments for reasoning about quantitative information. Some types of graphs that
we can make with this package are listed below:
Not just plotting, but you can also modify your plots the way you wish. You can stylise them and make
them more descriptive and communicable.
These packages help us in accessing the datasets we have and also in exploring them to develop a
better understanding of them.
Basic Statistics with Python
We have already understood that Data Sciences works around analysing data and performing tasks
around it. For analysing the numeric & alpha-numeric data used for this domain, mathematics comes
to our rescue. Basic statistical methods used in mathematics come in quite handy in Python too for
analysing and working around such datasets. Statistical tools widely used in Python are:
Do you remember using these formulas in your class? Let us recall all of them here:
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
4. What is Standard Deviation? How is it calculated?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
The advantage of using Python packages is that we do not need to make our own formula or equation to
find out the results. There exist a lot of pre-defined functions in packages like NumPy which reduce
this trouble for us. All we need to do is write that function and pass on the data to it. It's that simple!
Let us take a look at various Python syntaxes that can help us with the statistical work in data analysis.
Head to the Jupyter Notebook of Basic statistics with Python and start exploring! You may find the
Jupyter notebook here: http://bit.ly/data_notebook
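As a quick preview, here is a minimal sketch of such pre-defined functions, assuming the statistical tools meant are mean, median, mode, variance and standard deviation. The marks are made up. NumPy does not have a mode function of its own, so the statistics module that comes with Python is used for it.

```python
import statistics
import numpy as np

marks = np.array([2, 4, 4, 4, 5, 5, 7, 9])

print(np.mean(marks))     # 5.0  (sum of values / number of values)
print(np.median(marks))   # 4.5  (middle of the sorted values)
print(np.var(marks))      # 4.0  (average squared distance from the mean)
print(np.std(marks))      # 2.0  (square root of the variance)
print(statistics.mode(marks.tolist()))   # 4 (most frequent value)
```

By default, np.var and np.std divide by the number of values (the population formulas).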
Data Visualisation
While collecting data, it is possible that the data might come with some errors. Let us first take a look
at the types of issues we can face with data:
1. Erroneous Data: There are two ways in which the data can be erroneous:
• Incorrect values: The values in the dataset (at random places) are incorrect. For example, in
the column of phone number, there is a decimal value or in the marks column, there is a name
mentioned, etc. These are incorrect values that do not resemble the kind of data expected in
that position.
• Invalid or Null values: At some places, the values get corrupted and hence they become
invalid. Many times you will find NaN values in the dataset. These are null values which do not
hold any meaning and are not processible. That is why, these values (as and when
encountered) are removed from the database.
2. Missing Data: In some datasets, some cells remain empty. The values of these cells are missing and
hence the cells remain empty. Missing data cannot be interpreted as an error as the values here are
not erroneous or might not be missing because of any error.
3. Outliers: Data which does not fall in the range of a certain element are referred to as outliers. To
understand this better, let us take an example of marks of students in a class. Let us assume that a
student was absent for exams and hence has got 0 marks in it. If his marks are taken into account, the
whole class’s average would go down. To prevent this, the average is taken for the range of marks
from highest to lowest keeping this particular result separate. This makes sure that the average marks
of the class are true according to the data.
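The effect of such issues on an average can be sketched in plain Python. The marks below are made up: NaN stands for an invalid entry, None for an empty cell, and 0 for the student who was absent. Treating every 0 as the outlier is a simplification used only for this example.

```python
import math
import statistics

raw_marks = [78, 85, float("nan"), 92, None, 0, 88, 81]

# Remove empty cells (None) and invalid entries (NaN) first.
valid = [m for m in raw_marks if m is not None and not math.isnan(m)]

# Average with the absent student's 0 included
average_with_outlier = statistics.mean(valid)

# Keep the outlier separate and average the remaining marks
without_outlier = [m for m in valid if m != 0]
average_without_outlier = statistics.mean(without_outlier)

print(round(average_with_outlier, 2))      # 70.67
print(round(average_without_outlier, 2))   # 84.8
```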
Analysing the data collected can be difficult as it is all about tables and numbers. While machines work
efficiently on numbers, humans need visual aid to understand and comprehend the information
passed. Hence, data visualisation is used to interpret the data collected and identify patterns and
trends out of it.
In Python, Matplotlib package helps in visualising the data and making some sense out of it. As we
have already discussed before, with the help of this package, we can plot various kinds of graphs. Let
us discuss some of them here:
Scatter Plot
Scatter plots are used to plot discontinuous data; that is, data which does not have any
continuity in flow. There exist gaps in the data which introduce discontinuity.
A 2D scatter plot can display information for up to 4 parameters.
In this scatter plot, 2 axes (X and Y) are two different parameters. The colour of circles and the size
both represent 2 different parameters. Thus, just through one coordinate on the graph, one can
visualise 4 different parameters all at once.
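A minimal Matplotlib sketch of such a four-parameter scatter plot is given below. The data is randomly generated and the axis labels are placeholders, so treat it as an illustration rather than the plot shown in the figure.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: 30 points, each described by 4 parameters.
rng = np.random.default_rng(seed=0)
x = rng.uniform(0, 10, 30)             # parameter 1: position on the X axis
y = rng.uniform(0, 10, 30)             # parameter 2: position on the Y axis
colour_value = rng.uniform(0, 1, 30)   # parameter 3: shown as the colour of a circle
size_value = rng.uniform(20, 300, 30)  # parameter 4: shown as the size of a circle

fig, ax = plt.subplots()
points = ax.scatter(x, y, c=colour_value, s=size_value, cmap="viridis", alpha=0.7)
fig.colorbar(points, ax=ax, label="parameter 3")
ax.set_xlabel("parameter 1")
ax.set_ylabel("parameter 2")
ax.set_title("Scatter plot showing 4 parameters")
# In a Jupyter notebook the figure is displayed automatically;
# in a script, call plt.show() to open it.
```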
* Images shown here are the property of individual organisations and are used here for reference purpose only.
Bar Chart
It is one of the most commonly used graphical methods. From students to scientists, everyone uses
bar charts in some way or the other. It is a very easy to draw yet informative graphical
representation. Various versions of bar chart exist, like single bar chart, double bar chart, etc.
This is an example of a double bar chart. The 2 axes depict two different parameters, while bars of
different colours represent different entities (in this case, women and men). A bar chart also works
on discontinuous data and is made at uniform intervals.
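A double bar chart of this kind could be drawn with Matplotlib roughly as follows. The categories and counts are invented for the sketch and do not come from the figure.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical counts for two entities across four categories.
categories = ["A", "B", "C", "D"]
women = [12, 18, 9, 15]
men = [10, 14, 13, 11]

positions = np.arange(len(categories))  # uniform intervals on the X axis
width = 0.4                             # width of each bar

fig, ax = plt.subplots()
# Shift the two sets of bars left and right so that they sit side by side.
bars_women = ax.bar(positions - width / 2, women, width, label="Women")
bars_men = ax.bar(positions + width / 2, men, width, label="Men")
ax.set_xticks(positions)
ax.set_xticklabels(categories)
ax.set_xlabel("category")
ax.set_ylabel("count")
ax.legend()
# In a script, call plt.show() to display the chart.
```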
Histogram
When it comes to plotting the variation in just one entity over a period of time, histograms come
into the picture. A histogram represents the frequency of the variable at different points of time
with the help of bins.
In the given example, the histogram is showing the variation in frequency of the entity plotted with
the help of XY plane. Here, at the left, the frequency of the element has been plotted and it is a
frequency map for the same. The colours show the transition from low to high and vice versa. Whereas
on the right, a continuous dataset has been plotted which might not be talking about the frequency
of occurrence of the element.
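To see how bins turn raw values into frequencies, here is a small sketch with made-up marks. The bin edges are an assumption chosen for the example; `np.histogram` does the counting and `ax.hist` draws the same bins.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical marks of 12 students (a single variable).
marks = [35, 42, 47, 51, 55, 58, 62, 64, 68, 71, 77, 93]

# Five bins of width 20 covering the range 0 to 100.
bin_edges = [0, 20, 40, 60, 80, 100]

# np.histogram counts how many values fall into each bin.
counts, _ = np.histogram(marks, bins=bin_edges)
print(counts)  # [0 1 5 5 1]

# The same bins drawn as a histogram.
fig, ax = plt.subplots()
ax.hist(marks, bins=bin_edges, edgecolor="black")
ax.set_xlabel("marks")
ax.set_ylabel("frequency")
```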
Box Plots
When the distribution of data needs to be shown across its range, box plots come in handy. Box plots,
also known as box and whiskers plots, conveniently display the distribution of data throughout
the range with the help of 4 quartiles.
Here, as we can see, the plot contains a box, and the two lines at its left and right are termed as
whiskers. The plot has 5 different parts to it:
Quartile 1: From 0 percentile to 25th percentile – Here data lying between 0 and 25th percentile is
plotted. Now, if the data is close to each other, let's say 0 to 25th percentile data has been covered in
just 20-30 marks range, then the whisker would be smaller as the range is smaller. But if the range is
large, that is 0-30 marks range, then the whisker would also get elongated as the range is longer.
Quartile 2: From 25th percentile to 50th percentile – The 50th percentile is termed as the median of
the whole distribution, and since the data falling in the range of 25th percentile to 75th percentile has
minimum deviation from the median, it is plotted inside the box.
Quartile 3: From 50th percentile to 75th percentile – This range is again plotted in the box as its
deviation from the median is less. Quartile 2 & 3 (from 25th percentile to 75th percentile) together
constitute the Inter Quartile Range (IQR). Also, depending upon the range of distribution, just like
the whiskers, the length of the box varies according to how spread out the data is.
Quartile 4: From 75th percentile to 100th percentile – It is the whiskers plot for top 25 percentile data.
Outliers: The advantage of box plots is that they clearly show the outliers in a data distribution. Points
which do not lie in the range are plotted outside the box and whiskers as dots or circles and are
termed as outliers, as they do not belong to the range of data. Since being out of range is not an
error, they are still plotted on the graph for visualisation.
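The numbers behind a box plot can be computed directly with NumPy. The sketch below uses made-up marks and the common 1.5 × IQR rule for flagging outliers; that rule is an assumption of this example (it is also Matplotlib's default when drawing box plots), not something fixed by the text above.

```python
import numpy as np

# Hypothetical marks of 11 students; 0 belongs to an absent student.
marks = np.array([0, 45, 48, 50, 52, 55, 57, 60, 62, 65, 68])

# 25th, 50th and 75th percentiles: the edges and the middle line of the box.
q1, median, q3 = np.percentile(marks, [25, 50, 75])
iqr = q3 - q1  # Inter Quartile Range, the length of the box

# Assumed rule: points more than 1.5 * IQR away from the box are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = marks[(marks < lower_fence) | (marks > upper_fence)]

print(q1, median, q3)  # 49.0 55.0 61.0
print(iqr)             # 12.0
print(outliers)        # [0]
```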
Let us now move ahead and experience data visualisation using Jupyter notebook. Matplotlib library
will help us in plotting all sorts of graphs while Numpy and Pandas will help us in analysing the data.
Personality Prediction
Step 1: Here is a map. Take a good look at it. In this map you can see the arrows determine a quality.
The qualities mentioned are:
1. Positive X-axis – People focussed: You focus more on people and try to deliver the best
experience to them.
2. Negative X-axis – Task focussed: You focus more on the task which is to be accomplished and
try to do your best to achieve that.
3. Positive Y-axis – Passive: You focus more on listening to people and understanding everything
that they say without interruption.
4. Negative Y-axis – Active: You actively participate in the discussions and make sure that you
make your point in front of the crowd.
Think for a minute and understand which of these qualities you have in you. Now, take a chit and write
your name on it. Place this chit at a point in this map which best describes you. It can be placed
anywhere on the graph. Be honest about yourself and put it on the graph.
Step 2: Now that you have all put up your chits on the graph, it’s time to take a quick quiz. Go to this
link and finish the quiz on it individually: https://tinyurl.com/discanimal
On this link, you will find a personality prediction quiz. Take this quiz individually and try to answer all
the questions honestly. Do not take anyone’s help in it and do not discuss it with anyone. Once
the quiz is finished, remember the animal which has been predicted for you. Write it somewhere and
do not show it to anyone. Keep it as your little secret.
Once everyone has gone through the quiz, go back to the board, remove your chit and draw the
symbol which corresponds to your animal in place of your chit. Here are the symbols:
⚫ ☺
Place these symbols at the locations where you had put up your names. Ask 4 students not to do so
and tell them to keep their animals a secret. Let their name chits be on the graph so that we can
predict their animals with the help of this map.
Now, we will try to use the nearest neighbour algorithm here and try to predict what can be the
possible animal(s) for these 4 unknowns. Now look at these 4 chits one by one. Which animal is
occurring the most in their vicinity? Do you think that if the lion symbol is occurring the most near
their chit, then there is a good probability that their animal would also be a lion? Now let us try to
guess the animal for all 4 of them according to their respective nearest neighbours. After guessing
the animals, ask these 4 students if the guess is right or not.
• The KNN prediction model relies on the surrounding points or neighbours to determine its
class or group
• Utilises the properties of the majority of the nearest points to decide how to classify unknown
points
• Based on the concept that similar data points should be close to each other
The personality prediction activity was a brief introduction to K-Nearest Neighbours (KNN). As you
recall, in that activity, we tried to predict the animal for 4 students according to the animals which
were the nearest to their points. This is how KNN works in layman's terms. Here, K is a variable which
tells us about the number of neighbours which are taken into account during prediction. It can be any
integer value starting from 1.
Let us look at another example to demystify this algorithm. Let us assume that we need to predict the
sweetness of a fruit according to the data which we have for the same type of fruit. So here we have
three maps to predict the same:
Here, X is the value which is to be predicted. The green dots depict sweet values and the blue ones
denote not sweet.
Let us try it out by ourselves first. Look at the map closely and decide whether X should be sweet or
not sweet?
Here, we can see that K is taken as 1, which means that we are taking only 1 nearest neighbour into
consideration. The nearest value to X is a blue one, hence the 1-nearest neighbour algorithm predicts
that the fruit is not sweet.
In the 2nd graph, the value of K is 2. Taking the 2 nearest nodes to X into consideration, we see that
one is sweet while the other one is not sweet. This makes it difficult for the machine to make any
prediction based on the nearest neighbours, and hence the machine is not able to give any prediction.
In the 3rd graph, the value of K becomes 3. Here, the 3 nearest nodes to X are chosen, out of which 2
are green and 1 is blue. On the basis of this, the model is able to predict that the fruit is sweet.
KNN tries to predict an unknown value on the basis of the known values. The model simply calculates
the distance between each known point and the unknown point (by distance we mean the
difference between two values) and picks the K points whose distance is minimum. The prediction is
then made according to these K points.
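The procedure just described can be sketched in plain Python. The fruit measurements, the value of X and the names `known` and `knn_predict` are all hypothetical; they are chosen so that K = 1, 2 and 3 behave like the three graphs discussed above.

```python
from collections import Counter

# Hypothetical measurements of known fruits with their labels.
known = [
    (4.0, "not sweet"),
    (5.5, "sweet"),
    (6.5, "sweet"),
    (1.0, "not sweet"),
    (9.0, "sweet"),
]

def knn_predict(x, known_points, k):
    """Predict the label of x from its k nearest known points.

    Returns None when the vote is tied and no prediction can be made.
    """
    # Distance is the absolute difference between two values.
    by_distance = sorted(known_points, key=lambda point: abs(point[0] - x))
    nearest_labels = [label for _, label in by_distance[:k]]
    votes = Counter(nearest_labels).most_common()
    # If the top two labels have equal votes, there is no majority.
    if len(votes) > 1 and votes[0][1] == votes[1][1]:
        return None
    return votes[0][0]

x = 4.6  # the unknown fruit
print(knn_predict(x, known, 1))  # not sweet
print(knn_predict(x, known, 2))  # None (one vote each, so no prediction)
print(knn_predict(x, known, 3))  # sweet
```

With an even K such as 2 the vote can tie, which is one reason an odd K is usually preferred for two classes.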
1. As we decrease the value of K to 1, our predictions become less stable. Just think for a minute,
imagine K=1 and we have X surrounded by several greens and one blue, but the blue is the
single nearest neighbour. Reasonably, we would think X is most likely green, but because K=1,
KNN incorrectly predicts that it is blue.
2. Inversely, as we increase the value of K, our predictions become more stable due to majority
voting / averaging, and thus, more likely to make more accurate predictions (up to a certain
point). Eventually, we begin to witness an increasing number of errors. It is at this point we
know we have pushed the value of K too far.
3. In cases where we are taking a majority vote (e.g. picking the mode in a classification problem)
among labels, we usually make K an odd number to have a tiebreaker.
Computer Vision
Introduction
In the previous chapter, you studied the concepts of Artificial Intelligence for Data Sciences. It is a
concept to unify statistics, data analysis, machine learning and their related methods in order to
understand and analyse actual phenomena with data.
As we all know, artificial intelligence is a technique that enables computers to mimic human
intelligence. As humans, we can see things, analyse them and then do the required action on the basis
of what we see.
But can machines do the same? Can machines have the eyes that humans have? If you answered Yes,
then you are absolutely right. The Computer Vision domain of Artificial Intelligence enables machines
to see through images or visual data, and to process and analyse them on the basis of algorithms and
methods in order to analyse actual phenomena with images.
Now before we get into the concepts of Computer Vision, let us experience this domain with the help
of the following game:
Go to the link and try to play the game of Emoji Scavenger Hunt. The challenge here is to find 8 items
within the time limit to pass.
__________________________________________________________________________________
__________________________________________________________________________________
What was the strategy that you applied to win this game?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Was the computer able to identify all the items you brought in front of it?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Did the lighting of the room affect the identifying of items by the machine?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Facial Recognition*: With the advent of smart cities and smart homes,
Computer Vision plays a vital role in making the home smarter. Security
being the most important application involves use of Computer Vision
for facial recognition. It can be either guest recognition or log
maintenance of the visitors.
Face Filters*: Modern-day apps like Instagram and Snapchat have
a lot of features based on the usage of computer vision. The
application of face filters is one among them. Through the camera, the
machine or the algorithm is able to identify the facial dynamics of the
person and apply the selected facial filter.
Computer Vision in Retail*: The retail field has been one of the
fastest growing fields and at the same time is using Computer
Vision for making the user experience more fruitful. Retailers can
use Computer Vision techniques to track customers’ movements
through stores, analyse navigational routes and detect walking
patterns.
Inventory Management is another such application. Through
security camera image analysis, a Computer Vision algorithm can
generate a very accurate estimate of the items available in the
store. Also, it can analyse the use of shelf space to identify
suboptimal configurations and suggest better item placement.
Computer Vision: Getting Started
Computer Vision is a domain of Artificial Intelligence that deals with images. It involves the
concepts of image processing and machine learning models to build a Computer Vision based
application.
[Figure: Computer Vision tasks: Classification, Classification + Localisation, Object Detection, Instance Segmentation]
Classification
The image classification problem is the task of assigning an input image one label from a fixed set of
categories. This is one of the core problems in CV that, despite its simplicity, has a large variety of
practical applications.
Classification + Localisation
This is the task which involves both processes of identifying what object is present in the image and
at the same time identifying at what location that object is present in that image. It is used only for
single objects.
Object Detection
Object detection is the process of finding instances of real-world objects such as faces, bicycles, and
buildings in images or videos. Object detection algorithms typically use extracted features and
learning algorithms to recognize instances of an object category. It is commonly used in applications
such as image retrieval and automated vehicle parking systems.
Instance Segmentation
Instance Segmentation is the process of detecting instances of the objects, giving them a category and
then giving each pixel a label on the basis of that. A segmentation algorithm takes an image as input
and outputs a collection of regions (or segments).
Basics of Images
We all see a lot of images around us and use them daily, either through our mobile phones or computer
systems. But do we ask ourselves some basic questions while we use them on such a regular basis?
Don’t know the answer yet? Don’t worry, in this section we will study the basics of an image:
Basics of Pixels
The word “pixel” means a picture element. Every photograph, in digital form, is made up of pixels.
They are the smallest unit of information that make up a picture. Usually round or square, they are
typically arranged in a 2-dimensional grid.
In the image below, one portion has been magnified many times over so that you can see its individual
composition in pixels. As you can see, the pixels approximate the actual image. The more pixels you
have, the more closely the image resembles the original.
Resolution
The number of pixels in an image is sometimes called the resolution. When the term is used to describe
pixel count, one convention is to express resolution as the width by the height, for example a monitor
resolution of 1280×1024. This means there are 1280 pixels from one side to the other, and 1024 from
top to bottom.
Another convention is to express the number of pixels as a single number, like a 5 mega pixel camera
(a megapixel is a million pixels). This means the pixels along the width multiplied by the pixels along
the height of the image taken by the camera equals 5 million pixels. In the case of our 1280×1024
monitors, it could also be expressed as 1280 x 1024 = 1,310,720, or 1.31 megapixels.
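The same arithmetic can be checked in Python. The variable names are only illustrative.

```python
# Resolution expressed as width by height.
width, height = 1280, 1024

# Resolution expressed as a single number of pixels.
total_pixels = width * height

# One megapixel is a million pixels.
megapixels = total_pixels / 1_000_000

print(total_pixels)          # 1310720
print(round(megapixels, 2))  # 1.31
```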
Pixel value
Each of the pixels that represents an image stored inside a computer has a pixel value which describes
how bright that pixel is, and/or what colour it should be. The most common pixel format is the byte
image, where this number is stored as an 8-bit integer giving a range of possible values from 0 to 255.
Typically, zero is to be taken as no colour or black and 255 is taken to be full colour or white.
Why do we have a value of 255? In computer systems, data is in the form of ones and zeros, which we
call the binary system. Each bit in a computer system can have either a zero or a one. Each pixel uses
1 byte of an image, which is equivalent to 8 bits of data. Since each bit can have two possible values,
8 bits can have 2^8 = 256 possible values, which start from 0 and end at 255.
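A quick check of this count in Python, using NumPy's 8-bit unsigned integer type (`np.uint8`), which is a common way to store byte images:

```python
import numpy as np

bits_per_pixel = 8

# Each bit has 2 possible values, so 8 bits give 2 ** 8 combinations.
possible_values = 2 ** bits_per_pixel
print(possible_values)  # 256

# The smallest and largest numbers an 8-bit unsigned integer can hold.
limits = np.iinfo(np.uint8)
print(limits.min, limits.max)  # 0 255
```

There are 256 values in total because counting starts at 0, so the largest value is 255.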
Grayscale Images
Grayscale images are images which have a range of shades of gray without apparent colour. The
darkest possible shade is black, which is the total absence of colour or a pixel value of zero. The
lightest possible shade is white, which is the total presence of colour or a pixel value of 255.
Intermediate shades of gray are represented by equal brightness levels of the three primary colours.
In a grayscale image, each pixel is of size 1 byte and the pixels are stored as a single plane, that is,
one 2D array. The size of a grayscale image is defined as the Height x Width of that image.
Here is an example of a grayscale image. As you can check, the values of the pixels are within the
range of 0-255. Computers store the images we see in the form of these numbers.
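As a rough illustration, a tiny grayscale image can be written as a 2D NumPy array. The pixel values below are invented for the sketch.

```python
import numpy as np

# Hypothetical 4 x 5 grayscale image: one plane, one byte per pixel.
image = np.array(
    [
        [0, 50, 100, 150, 200],
        [10, 60, 110, 160, 210],
        [20, 70, 120, 170, 220],
        [30, 80, 130, 180, 255],
    ],
    dtype=np.uint8,
)

height, width = image.shape
print(height, width)             # 4 5
print(image.min(), image.max())  # 0 255 (black to white)
print(image.nbytes)              # 20 bytes, one per pixel
```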
RGB Images
All the images that we see around are coloured images. These images are made up of three primary
colours Red, Green and Blue. All the colours that are present can be made by combining different
intensities of red, green and blue.
Let us experience!
___________________________________________________________________________
___________________________________________________________________________
3) How does the colour vary when you put either of the three as 0 and then keep on varying
the other two?
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
4) How does the output colour change when all the three colours are varied in same
proportion ?
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
5) What is the RGB value of your favourite colour from the colour palette?
___________________________________________________________________________
Were you able to answer all the questions? If yes, then you would have understood how every colour
we see around is made.
Now the question arises, how do computers store RGB images? Every RGB image is stored in the form
of three different channels called the R channel, G channel and the B channel.
Each plane separately has a number of pixels, with each pixel value varying from 0 to 255. All the three
planes, when combined together, form a colour image. This means that in an RGB image, each pixel has
a set of three different values which together give colour to that particular pixel.
For Example,
As you can see, each colour image is stored in the form of three different channels, each having
different intensity. All three channels combine together to form a colour we see.
In the above given image, if we split the image into three different channels, namely Red (R), Green
(G) and Blue (B), the individual layers will have the following intensity of colours of the individual
pixels. These individual layers, when stored in the memory, look like the image on the extreme right.
Each layer looks like a grayscale image because each pixel has an intensity value from 0 to 255 and, as
studied earlier, 0 is considered as black or no presence of colour and 255 means white or full presence
of colour. These three individual RGB values, when combined together, form the colour of each pixel.
Therefore, each pixel in the RGB image has three values to form the complete colour.
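The three-channel storage described above can be sketched with a NumPy array of shape (height, width, 3). The 2 x 2 image here is made up, and the R, G, B channel order is an assumption of this example; some libraries use a different order.

```python
import numpy as np

# Hypothetical 2 x 2 RGB image: each pixel holds three values (R, G, B).
image = np.array(
    [
        [[255, 0, 0], [0, 255, 0]],    # a red pixel, a green pixel
        [[0, 0, 255], [255, 255, 0]],  # a blue pixel, a yellow pixel
    ],
    dtype=np.uint8,
)
print(image.shape)  # (2, 2, 3)

# Split the image into its three planes; each plane is a 2D array of 0-255 values.
red = image[:, :, 0]
green = image[:, :, 1]
blue = image[:, :, 2]
print(red.tolist())  # [[255, 0], [0, 255]]

# Stacking the three planes back together gives the colour image again.
rebuilt = np.stack([red, green, blue], axis=-1)
print(np.array_equal(rebuilt, image))  # True
```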
Task :
Go to the following link www.piskelapp.com and create your own pixel art. Try and make a GIF using
the online app for your own pixel art.
Image Features
In computer vision and image processing, a feature is a piece of information which is relevant for
solving the computational task related to a certain application. Features may be specific structures in
the image such as points, edges or objects.
For example:
Imagine that your security camera is capturing an image. At the top of the image we are given six small
patches of images. Our task is to find the exact location of those image patches in the image.
Take a pencil and mark the exact location of those patches in the image.
Were you able to find the exact location of all the patches?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Let’s Reflect:
Let us take the patches into account one at a time and then check the exact location of those patches.
For Patch A and B: The patches A and B are flat surfaces in the image and are spread over a lot of area.
They can be present at any location in a given area in the image.
For Patch C and D: The patches C and D are simpler to find as compared to A and B. They are edges of
a building and we can find an approximate location of these patches, but finding the exact location is
still difficult. This is because the pattern is the same everywhere along the edge.
For Patch E and F: The patches E and F are the easiest to find in the image, because E and F are
corners of the building. At a corner, wherever we move the patch, it will look different.
Conclusion
In image processing, we can get a lot of features from the image. It can be either a blob, an edge or a
corner. These features help us to perform various tasks and then get the analysis done on the basis of
the application. Now the question that arises is which of the following are good features to be used?
As you saw in the previous activity, the features having the corners are easy to find as they can be
found only at a particular location in the image, whereas the edges which are spread over a line or an
edge look the same all along. This tells us that the corners are always good features to extract from
an image followed by the edges.
Let’s look at another example to understand this. Consider the images given below and apply the
concept of good features for the following.
In the above image how would we determine the exact location of each patch?
The blue patch is a flat area and difficult to find and track. Wherever you move the blue patch it looks
the same. The black patch has an edge. Moved along the edge (parallel to edge), it looks the same.
The red patch is a corner. Wherever you move the patch, it looks different, therefore it is unique.
Hence, corners are considered to be good features in an image.
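The idea that a flat patch looks the same wherever it moves, an edge patch looks the same along the edge, and a corner patch always looks different can be tested numerically. The sketch below builds a synthetic image and measures how much a 3 x 3 patch changes when moved by one pixel. The image, the patch positions and the helper `changes_when_moved` are all hypothetical.

```python
import numpy as np

# Synthetic 12 x 12 grayscale image: black background with a white square
# filling the lower-right part, so it has flat areas, edges and one corner.
image = np.zeros((12, 12), dtype=np.int32)
image[6:, 6:] = 255

def changes_when_moved(img, row, col, size=3):
    """Sum of squared differences between the patch at (row, col) and the
    same-sized patch moved by one pixel in each of the 8 directions."""
    patch = img[row:row + size, col:col + size]
    changes = []
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            if d_row == 0 and d_col == 0:
                continue  # skip the "no movement" case
            moved = img[row + d_row:row + d_row + size,
                        col + d_col:col + d_col + size]
            changes.append(int(((patch - moved) ** 2).sum()))
    return changes

flat = changes_when_moved(image, 1, 1)    # inside the black area
edge = changes_when_moved(image, 8, 5)    # across the left edge of the square
corner = changes_when_moved(image, 5, 5)  # over the corner of the square

print(max(flat))              # 0: no change in any direction
print(min(edge), max(edge))   # 0 195075: unchanged along the edge only
print(min(corner))            # 130050: changes in every direction
```

Only the corner patch changes for every movement, which is what makes it possible to pin down its exact location.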
Introduction to OpenCV
Now that we have learnt about image features and their importance in image processing, we will learn
about a tool we can use to extract these features from our image for further processing.
OpenCV or Open Source Computer Vision Library is that tool which helps a computer extract these
features from the images. It is used for all kinds of image and video processing and analysis. It is
capable of processing images and videos to identify objects, faces, or even handwriting.
In this chapter we will use OpenCV for basic image processing operations on images such as resizing,
cropping and many more.
To install OpenCV library, open anaconda prompt and then write the following
command:
Now let us take a deep dive on the various functions of OpenCV to understand the various image
processing techniques. Head to Jupyter Notebook for introduction to OpenCV given on this link:
http://bit.ly/cv_notebook
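Before opening the notebook, it may help to see what cropping and resizing mean at the level of pixel values. The sketch below uses only NumPy on a made-up image, so it runs without OpenCV. The helper `resize_nearest` is hypothetical and picks the nearest source pixel for every output pixel; OpenCV offers a comparable operation through `cv2.resize`, although the exact pixels it picks may differ.

```python
import numpy as np

# Hypothetical 6 x 8 grayscale image whose pixel values are simply 0, 1, 2, ...
image = np.arange(6 * 8, dtype=np.uint8).reshape(6, 8)

# Cropping is slicing: keep rows 1 to 4 and columns 2 to 5.
cropped = image[1:5, 2:6]
print(cropped.shape)  # (4, 4)

def resize_nearest(img, new_height, new_width):
    """Resize by picking, for every output pixel, a nearby source pixel."""
    rows = np.arange(new_height) * img.shape[0] // new_height
    cols = np.arange(new_width) * img.shape[1] // new_width
    return img[rows][:, cols]

# Halve the image in both directions.
smaller = resize_nearest(image, 3, 4)
print(smaller.shape)        # (3, 4)
print(smaller[0].tolist())  # [0, 2, 4, 6]
```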
Convolution
We have learnt that computers store images in numbers, and that pixels are arranged in a particular
manner to create the picture we can recognize. These pixels have values varying from 0 to 255, and the
value of a pixel determines the colour of that pixel.
But what if we edit these numbers, will it bring a change to the image? The answer is yes. As we change
the values of these pixels, the image changes. This process of changing pixel values is the base of image
editing.
We all use a lot of image editing software like Photoshop and at the same time use apps like Instagram
and Snapchat, which apply filters to the image to enhance the quality of that image.
As you can see, different filters applied to an image change the pixel values evenly throughout the
image. How does this happen? This is done with the help of the process of convolution and the
convolution operator which is commonly used to create these effects.
Before we understand how the convolution operation works, let us try and create a theory for the
convolution operator by experiencing it using an online application.
Task
Go to the link http://matlabtricks.com/post-5/3x3-convolution-kernels-with-online-demo and at the
bottom of the page click on “Click to Load Application”.
Once the application is loaded, try different filters and apply them on the image. Observe how the
values of the kernel change for different filters. Try these steps
Let us follow the following steps to understand how a convolution operator works. The steps to be
followed are:
What theory do you propose for convolution on the basis of the observation?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
It is time to test the theory. Change the location of the four numbers and follow the above mentioned
steps. Does your theory hold true?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
If yes, change the picture and try whether the theory holds true or not. If it does not hold true, modify
your theory and keep trying until it satisfies all the conditions.
Let’s Discuss
What effect did you apply?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Convolution : Explained
Convolution is a simple mathematical operation which is fundamental to many
common image processing operators. Convolution provides a way of 'multiplying together' two arrays
of numbers, generally of different sizes, but of the same dimensionality, to produce a third array of
numbers of the same dimensionality.
An (image) convolution is simply an element-wise multiplication of a region of the image array with
another array called the kernel, followed by a sum.
As you can see here,
I = Image Array
K = Kernel Array
Note: The Kernel is passed over the whole image to get the resulting array after convolution.
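The operation can be sketched in NumPy as follows. The image values, the kernel and the function name `convolve_valid` are made up for the illustration. Following the description above, the kernel is multiplied element-wise with each region and summed, without flipping the kernel first.

```python
import numpy as np

def convolve_valid(image, kernel):
    """Slide the kernel over the image; at each position multiply the
    overlapping values element-wise and add them up."""
    k_height, k_width = kernel.shape
    out_height = image.shape[0] - k_height + 1
    out_width = image.shape[1] - k_width + 1
    output = np.zeros((out_height, out_width), dtype=np.int64)
    for r in range(out_height):
        for c in range(out_width):
            region = image[r:r + k_height, c:c + k_width]
            output[r, c] = (region * kernel).sum()
    return output

# Hypothetical 4 x 4 image I and 3 x 3 kernel K (stored as ordinary integers
# so that negative results are not lost).
I = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12],
              [13, 14, 15, 16]])
K = np.array([[0, -1, 0],
              [-1, 5, -1],
              [0, -1, 0]])

result = convolve_valid(I, K)
print(result.tolist())  # [[6, 7], [10, 11]]
```

For the first output value, the kernel covers the top-left 3 x 3 region: 5 × 6 − 2 − 5 − 7 − 10 = 6. The output is 2 x 2 because a 3 x 3 kernel fits inside a 4 x 4 image at only four positions.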
What is a Kernel?
A Kernel is a matrix which is slid across the image and multiplied with the input such that the output
is enhanced in a certain desirable manner. Each kernel has different values for the different kinds of
effects that we want to apply to an image.
In image processing, we use the convolution operation to extract features from the images which
can later be used for further processing, especially in a Convolutional Neural Network (CNN), about
which we will study later in the chapter.
In this process, we place the centre of the kernel over each pixel of the image in turn to obtain the
convolution output. In the process of doing it, the output image becomes smaller, because the kernel
cannot be centred on the pixels in the edge rows and columns without extending beyond the image.
What if we want the output image to be of exactly the same size as the input image? How can we
achieve this?
To achieve this, we need to extend the edges of the original image out by one pixel while overlapping
the centres and performing the convolution. This will help us keep the input and output image of the
same size. The pixels added while extending the edges are given the value zero.
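The effect of extending the edges with zeros can be checked with a short Python sketch. The helper names pad_with_zeros and convolve, the 3x3 image and the kernel used here are assumptions chosen for illustration; the kernel has a single 1 at its centre, so a correctly padded convolution should simply return the original image.

```python
def pad_with_zeros(image, pad=1):
    """Return a copy of the image with `pad` rows/columns of zeros added on every side."""
    width = len(image[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]
    for row in image:
        padded.append([0] * pad + list(row) + [0] * pad)
    padded.extend([0] * width for _ in range(pad))
    return padded


def convolve(image, kernel):
    """Element-wise multiply and sum at every position where a square kernel fits."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(cols)]
            for r in range(rows)]


# Hypothetical 3x3 image and a kernel that keeps only the centre pixel
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
centre_kernel = [[0, 0, 0],
                 [0, 1, 0],
                 [0, 0, 0]]

without_padding = convolve(image, centre_kernel)               # shrinks to 1x1
with_padding = convolve(pad_with_zeros(image), centre_kernel)  # stays 3x3

print(without_padding)
print(with_padding)
```

Without padding the 3x3 image shrinks to a single value, while the padded version keeps the 3x3 size of the input.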
* Images shown here are the property of individual organisations and are used here for reference purpose only.
Let’s try
In this section we will try performing the convolution operation on paper to understand how it works.
Fill in the blank places of the output image by performing the convolution operation.
Image (I):
155 146  13  20   0  12  45   0
100 175   0  25  25  15   0   0
120 156 255   0  78  56  23   0
115 113  25  90   0  80  56 155
Kernel (K):
-1  0 -1
 0 -1  0
-1  0 -1
Write Your Output Here :
Summary
1. Convolution is a common tool used for image editing.
2. It is an element-wise multiplication of an image and a kernel, followed by a sum, to get the
desired output.
3. In computer vision applications, it is used in Convolutional Neural Networks (CNN) to extract
image features.
Let’s recall
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Fill in the names of different layers of Neural Network.
Did you get the answers right? In this section, we are going to study one such neural network,
the Convolutional Neural Network (CNN). Many of the current computer vision applications use
this powerful neural network.
A Convolutional Neural Network (CNN) is a Deep Learning algorithm which can take in an input image,
assign importance (learnable weights and biases) to various aspects/objects in the image and be able
to differentiate one from the other.
In the above diagram, we give an input image to a CNN, which processes it and then gives a
prediction on the basis of the labels given in the particular dataset.
A CNN consists of the following layers:
1) Convolution Layer
2) Rectified Linear Unit (ReLU)
3) Pooling Layer
4) Fully Connected Layer
Convolution Layer
It is the first layer of a CNN. The objective of the convolution operation is to extract features,
such as edges, from the input image. A CNN need not be limited to only one Convolution
Layer. Conventionally, the first Convolution Layer is responsible for capturing the low-level features
such as edges, colour, gradient orientation, etc. With added layers, the architecture adapts to the
high-level features as well, giving us a network which has a more complete understanding of the
images in the dataset.
It uses convolution operation on the images. In the convolution layer, there are several kernels that
are used to produce several features. The output of this layer is called the feature map. A feature map
is also called the activation map. We can use these terms interchangeably.
There are several uses we derive from the feature map:
• We reduce the image size so that it can be processed more efficiently.
• We only focus on the features of the image that can help us in processing the image further.
For example, you might only need to recognize someone’s eyes, nose and mouth to recognize the
person. You might not need to see the whole face.
Rectified Linear Unit Function
The next layer in the Convolutional Neural Network is the Rectified Linear Unit function or the ReLU
layer. After we get the feature map, it is passed on to the ReLU layer. This layer simply replaces
all the negative numbers in the feature map with zero and lets the positive numbers stay as they are.
Passing the feature map through the ReLU layer introduces non-linearity in it. Let us see it
through a graph.
If we see the two graphs side by side, the one on the left is a linear graph. This graph, when passed
through the ReLU layer, gives the one on the right. The ReLU graph is a horizontal straight line at
zero for negative inputs and then increases linearly for positive inputs.
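The behaviour of the ReLU layer can be written as a very small Python sketch. The function name relu and the feature map values below are assumptions made for illustration only.

```python
def relu(feature_map):
    """Replace every negative value with 0 and keep positive values unchanged."""
    return [[max(0, value) for value in row] for row in feature_map]


# Hypothetical feature map containing negative and positive values
feature_map = [[-3, 5, 0],
               [2, -7, 4]]

activated = relu(feature_map)
print(activated)
```

Negative entries become 0 while the positive entries pass through unchanged, which is the shape of the graph on the right.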
Now the question arises: why do we pass the feature map to the ReLU layer? It is to make the colour
change more obvious and more abrupt.
As shown in the above convolved image, there is a smooth grey gradient change from black to white.
After applying the ReLU function, we can see a more abrupt change in colour, which makes the edges
more obvious. This acts as a better feature for the further layers in a CNN as it enhances the
activation layer.
Pooling Layer
Similar to the Convolutional Layer, the Pooling layer is responsible for reducing the spatial size of the
Convolved Feature while still retaining the important features. Two types of pooling are:
1) Max Pooling: Max Pooling returns the maximum value from the portion of the image covered
by the Kernel.
2) Average Pooling: Average Pooling returns the average of all the values from the portion of the
image covered by the Kernel.
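The two kinds of pooling can be compared with a short Python sketch. It assumes a 2x2 window that moves in steps of 2 (so the windows do not overlap); the function names pool and average and the feature map values are hypothetical and chosen only for illustration.

```python
def pool(feature_map, size, reduce_window):
    """Split the feature map into size x size windows and reduce each to one value."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(reduce_window(window))
        pooled.append(row)
    return pooled


def average(values):
    return sum(values) / len(values)


# Hypothetical 4x4 feature map
feature_map = [[1, 3, 2, 4],
               [5, 6, 1, 2],
               [7, 2, 9, 0],
               [1, 8, 3, 4]]

max_pooled = pool(feature_map, 2, max)      # keeps the largest value per window
avg_pooled = pool(feature_map, 2, average)  # keeps the mean value per window

print(max_pooled)
print(avg_pooled)
```

Both versions reduce the 4x4 feature map to 2x2; they differ only in how each window is summarised.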
The pooling layer is an important layer in the CNN as it performs a series of tasks, which are as
follows:
A small difference in the input image will still create a very similar pooled image.
Fully Connected Layer
The final layer in the CNN is the Fully Connected Layer (FCP). The objective of a fully connected layer
is to take the results of the convolution/pooling process and use them to classify the image into a label
(in a simple classification example).
The output of convolution/pooling is flattened into a single vector of values, each representing a
probability that a certain feature belongs to a label. For example, if the image is of a cat, features
representing things like whiskers or fur should have high probabilities for the label “cat”.
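A rough Python sketch of this last step is given below. It is not the method of any particular library: the weights, the labels "cat" and "dog", the pooled values and the use of a softmax function to turn scores into probabilities are all assumptions made for illustration.

```python
import math


def flatten(feature_map):
    """Turn a 2D pooled feature map into a single vector of values."""
    return [value for row in feature_map for value in row]


def softmax(scores):
    """Convert raw scores into probabilities that add up to 1."""
    highest = max(scores)
    exps = [math.exp(score - highest) for score in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fully_connected(features, weights):
    """Compute one weighted sum per label, then convert the sums to probabilities."""
    labels = list(weights)
    scores = [sum(w * x for w, x in zip(weights[label], features))
              for label in labels]
    return dict(zip(labels, softmax(scores)))


# Hypothetical pooled output and hypothetical (already learnt) weights per label
pooled = [[6, 4],
          [8, 9]]
weights = {"cat": [0.2, 0.1, 0.0, 0.1],
           "dog": [0.0, 0.1, 0.1, 0.0]}

probabilities = fully_connected(flatten(pooled), weights)
predicted_label = max(probabilities, key=probabilities.get)
print(probabilities, predicted_label)
```

In a real CNN the weights are learnt during training; here they are fixed by hand so that the flow from flattened vector to label can be followed.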
Let’s Summarize:
Write the whole process of how a CNN works on the basis of the above diagram.
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Let’s Experience
Now let us see how this comes into practice. To see that, go to the link
http://scs.ryerson.ca/~aharley/vis/conv/flat.html
This is an online application that classifies different numbers. We need to analyse the different layers
in the application on the basis of the CNN that we have studied in the previous section.
Natural Language Processing
Introduction
Till now, we have explored two domains of AI: Data Science and Computer Vision. Both these domains
differ from each other in terms of the data on which they work. Data Science works around numbers
and tabular data while Computer Vision is all about visual data like images and videos. The third
domain, Natural Language Processing (commonly called NLP) takes in the data of Natural Languages
which humans use in their daily lives and operates on this.
Natural Language Processing, or NLP, is the sub-field of AI that is focused on enabling computers to
understand and process human languages. NLP is a subfield of Linguistics, Computer Science,
Information Engineering, and Artificial Intelligence concerned with the interactions between
computers and human (natural) languages, in particular how to program computers to process and
analyse large amounts of natural language data.
But how do computers do that? How do they understand what we say in our language? This chapter
is all about demystifying the Natural Language Processing domain and understanding how it works.
Before we get deeper into NLP, let us experience it with the help of this AI Game:
Go to this link on Google Chrome, launch the experiment and try to identify the Mystery Animal by
asking the machine 20 Yes or No questions.
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
If no, how many times did you try playing this game?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Were there any challenges that you faced while playing this game? If yes, list them down.
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
indicators of their reputation. Beyond determining simple polarity, sentiment analysis understands
sentiment in context to help better understand what’s behind an expressed opinion, which can be
extremely relevant in understanding and driving purchasing decisions.
The Scenario
The world is competitive nowadays. People face
competition in even the tiniest tasks and are expected to
give their best at every point in time. When people are
unable to meet these expectations, they get stressed and
could even go into depression. We get to hear a lot of cases
where people are depressed due to reasons like peer
pressure, studies, family issues, relationships, etc. and they
eventually get into something that is bad for them as well
as for others. So, to overcome this, cognitive behavioural
therapy (CBT) is considered to be one of the best methods
to address stress as it is easy to implement on people and
also gives good results. This therapy includes
understanding the behaviour and mindset of a person in their normal life. With the help of CBT,
therapists help people overcome their stress and live a happy life.
To understand more about the concept of this therapy, visit this link:
https://en.wikipedia.org/wiki/Cognitive_behavioral_therapy
Problem Scoping
CBT is a technique used by most therapists to help patients out of stress and depression. But it has
been observed that people do not wish to seek the help of a psychiatrist willingly. They try to avoid
such interactions as much as possible. Thus, there is a need to bridge the gap between a person who
needs help and the psychiatrist. Let us look at various factors around this problem through the 4Ws
problem canvas.
What do we know about them?
o People who are going through stress are reluctant to consult a psychiatrist.
What is the problem?
o People who need help are reluctant to consult a psychiatrist and hence live miserably.
How do you know it is a problem?
o Studies around mental stress and depression available on various authentic sources.
What would be of key value to the stakeholders?
o People get a platform where they can talk and vent out their feelings anonymously
o People get a medium that can interact with them and applies primitive CBT on them and can
suggest help whenever needed
How would it improve their situation?
o People would be able to vent out their stress
o They would consider going to a psychiatrist whenever required
Now that we have gone through all the factors around the problem, the problem statement template
goes as follows:
“To create a chatbot which can interact with people, help them
to vent out their feelings and take them through primitive CBT.”
Data Acquisition
To understand the sentiments of people, we need to collect their conversational data so the machine
can interpret the words that they use and understand their meaning. Such data can be collected
through various means:
Modelling
Once the text has been normalised, it is then fed to an NLP based AI model. Note that in NLP, modelling
requires data pre-processing first; only after that is the data fed to the machine. Depending upon the
type of chatbot we try to make, there are a lot of AI models available which help us build the
foundation of our project.
Evaluation
The trained model is then evaluated, and its accuracy is calculated on the basis of the
relevance of the answers which the machine gives to the user’s responses. To understand the
efficiency of the model, the answers suggested by the chatbot are compared to the actual answers.
As you can see in the above diagram, the blue line represents the model’s output while the green one
is the actual output along with the data samples.
Figure 1: The model’s output does not match the true function at all. Hence the model is said
to be underfitting and its accuracy is lower.
Figure 2: In the second one, the model’s performance matches well with the true function,
which states that the model has optimum accuracy and the model is called a
perfect fit.
Figure 3: In the third case, model performance is trying to cover all the data samples even if
they are out of alignment to the true function. This model is said to be overfitting
and this too has a lower accuracy.
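The comparison of the chatbot's suggested answers with the actual answers can be sketched in Python as a simple accuracy calculation. The function name accuracy and the answer labels below are hypothetical; they only illustrate the idea of counting how many suggested answers match the actual ones.

```python
def accuracy(suggested, actual):
    """Fraction of suggested answers that match the actual answers."""
    matches = sum(1 for s, a in zip(suggested, actual) if s == a)
    return matches / len(actual)


# Hypothetical answers given by the chatbot and the answers that were expected
suggested_answers = ["greet", "suggest_breathing", "ask_more", "refer_to_therapist"]
actual_answers = ["greet", "suggest_breathing", "suggest_breathing", "refer_to_therapist"]

score = accuracy(suggested_answers, actual_answers)
print(score)
```

Three of the four suggested answers match, so the accuracy in this made-up example is 0.75.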
Once the model is evaluated thoroughly, it is then deployed in the form of an app which people can
use easily.
Chatbots
As we have seen earlier, one of the most common applications of Natural Language Processing is a
chatbot. There are a lot of chatbots available and many of them use the same approach as we used in
the scenario above. Let us try some of the chatbots and see how they work.
• Mitsuku Bot*
https://www.pandorabots.com/mitsuku/
• CleverBot*
https://www.cleverbot.com/
• Jabberwacky*
http://www.jabberwacky.com/
• Haptik*
https://haptik.ai/contact-us
• Rose*
http://ec2-54-215-197-164.us-west-1.compute.amazonaws.com/speech.php
• Ochatbot*
https://www.ometrics.com/blog/list-of-fun-chatbots/
Let us discuss!
• Which chatbot did you try? Name any one.
• What is the purpose of this chatbot?
• How was the interaction with the chatbot?
• Did the chat feel like talking to a human or a robot? Why do you think so?
• Do you feel that the chatbot has a certain personality?
As you interact with more and more chatbots, you would realise that some of them are scripted, or in
other words are traditional chatbots, while others are AI-powered and have more knowledge. With
the help of this experience, we can understand that there are two types of chatbots around us:
Script-bot and Smart-bot. Let us understand what each of them means in detail:
Script-bot | Smart-bot
Script bots are easy to make | Smart bots are flexible and powerful
Script bots work around a script which is programmed in them | Smart bots work on bigger databases and other resources directly
Mostly they are free and are easy to integrate to a messaging platform | Smart bots learn with more data
No or little language processing skills | Coding is required to take this up on board
Limited functionality | Wide functionality
The story speaker activity which was done in class 9 can be considered a script-bot, as in that activity
we created a script around which the interactive story revolved. As soon as the machine got
triggered by the person, it followed the script and answered accordingly. Other examples of script
bots include the bots which are deployed in the customer care section of various companies. Their
job is to answer some basic queries that they are coded for and to connect the customer to human
executives once they are unable to handle the conversation.
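A script-bot of the kind described above can be sketched in Python with a simple lookup. The keywords, replies and the function name script_bot_reply are hypothetical; the point is only that the bot follows its script and hands over to a human executive when no rule matches.

```python
# Hypothetical script: keyword -> fixed reply
SCRIPT = {
    "hello": "Hi! How can I help you?",
    "timings": "Our store is open from 9 am to 6 pm.",
    "refund": "Refunds are processed within 7 working days.",
}

FALLBACK = "I am connecting you to a human executive."


def script_bot_reply(message):
    """Return the scripted reply for the first keyword found in the message."""
    text = message.lower()
    for keyword, reply in SCRIPT.items():
        if keyword in text:
            return reply
    return FALLBACK


print(script_bot_reply("Hello there"))
print(script_bot_reply("What are your timings?"))
print(script_bot_reply("My parcel arrived damaged"))
```

The bot has no language processing beyond keyword matching, which is why its functionality is limited to what the script covers.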
On the other hand, all the assistants like Google Assistant, Alexa, Cortana, Siri, etc. can be taken as
smart bots as not only can they handle the conversations but can also manage to do other tasks which
makes them smarter.
The sound reaches the brain through a long channel. As a person speaks, the sound travels from his
mouth and goes to the listener’s eardrum. The sound striking the eardrum is converted into a neuron
impulse, gets transported to the brain and then gets processed. After processing the signal, the brain
gains an understanding of its meaning. If it is clear, the signal gets stored. Otherwise, the
listener asks the speaker for clarity. This is how human languages are processed by humans.
On the other hand, the computer understands the language of numbers. Everything that is sent to the
machine has to be converted to numbers. And while typing, if a single mistake is made, the computer
throws an error and does not process that part. The communications made by the machines are very
basic and simple.
Now, if we want the machine to understand our language, how should this happen? What are the
possible difficulties a machine would face in processing natural language? Let us take a look at some
of them here:
This is the issue related to the syntax of the language. Syntax refers to the grammatical structure of a
sentence. When the structure is present, we can start interpreting the message. Now we also want to
have the computer do this. One way to do this is to use part-of-speech tagging. This allows the
computer to identify the different parts of speech in a sentence.
Besides the matter of arrangement, there’s also meaning behind the language we use. Human
communication is complex. There are multiple characteristics of the human language that might be
easy for a human to understand but extremely difficult for a computer to understand.
Here the way these statements are written is different, but their meanings are the same, that is, 5.
Here the statements written have the same syntax but their meanings are different. In Python 2.7,
this statement would result in 1 while in Python 3, it would give an output of 1.5.
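Both ideas can be tried out in Python 3. The statements themselves are not reproduced in the text above, so the expressions used here (2 + 3, 3 + 2 and 3 / 2) are assumptions chosen to match the results mentioned, namely 5, 1 and 1.5.

```python
# Different syntax, same semantics: both expressions mean 5
first = 2 + 3
second = 3 + 2

# Same syntax, different semantics across Python versions:
# in Python 3 the / operator performs true division
true_division = 3 / 2

# The // operator performs floor division, which is the result
# that 3 / 2 gives for integers in Python 2.7
floor_division = 3 // 2

print(first, second, true_division, floor_division)
```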
Think of some other examples of different syntax and same semantics and vice-versa.
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Multiple Meanings of a word
Let’s consider these three sentences:
His face turned red after he found out that he took the wrong bag.
What does this mean? Is he feeling ashamed because he took another person’s bag instead of his own?
Is he feeling angry because he did not manage to steal the bag that he had been targeting?
Here we can see that context is important. We understand a sentence almost intuitively, depending
on our history of using the language, and the memories that have been built within. In all three
sentences, the word red has been used in three different ways, and the context of each
statement changes its meaning completely. Thus, in natural language, it is important to understand
that a word can have multiple meanings and that the meaning which applies depends on the context
of the statement.
Think of some other words which can have multiple meanings and use them in sentences.
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
This statement is grammatically correct, but does it make any sense? In human language, a perfect
balance of syntax and semantics is important for better understanding.
Think of some other sentences having correct syntax and incorrect semantics.
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
These are some of the challenges we might have to face if we try to teach computers how to
understand and interact in human language. So how does Natural Language Processing do this magic?
Data Processing
Humans interact with each other very easily. For us, the natural languages that we use are so
convenient that we speak them easily and understand them well too. But for computers, our
languages are very complex. As you have already gone through some of the complications in human
languages above, now it is time to see how Natural Language Processing makes it possible for the
machines to understand and speak in the Natural Languages just like humans.
Since the language of computers is numerical, the very first step that comes to our
mind is to convert our language to numbers. This conversion takes a few steps to happen. The first
step is Text Normalisation. Since human languages are complex, we need to first simplify
them in order to make sure that the understanding becomes possible. Text Normalisation helps in
cleaning up the textual data in such a way that it comes down to a level where its complexity is lower
than the actual data. Let us go through Text Normalisation in detail.
Text Normalisation
In Text Normalisation, we undergo several steps to normalise the text to a lower level. Before we
begin, we need to understand that in this section, we will be working on a collection of written text.
That is, we will be working on text from multiple documents, and the term used for the whole textual
data from all the documents together is corpus. Not only would we go through all the
steps of Text Normalisation, we would also work them out on a corpus. Let us take a look at the steps:
Sentence Segmentation
Under sentence segmentation, the whole corpus is divided into sentences. Each sentence is taken as
a separate piece of data, so now the whole corpus gets reduced to sentences.
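Sentence segmentation can be sketched in Python with the built-in re module. This is a simplified sketch which assumes that every sentence ends with a full stop, question mark or exclamation mark followed by a space; the function name segment_sentences and the sample corpus are hypothetical.

```python
import re


def segment_sentences(corpus):
    """Split the corpus after '.', '!' or '?' whenever it is followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", corpus.strip())
    return [part for part in parts if part]


# Hypothetical corpus
corpus = "Divya has a dog. The dog is playful! Does it like to run?"

sentences = segment_sentences(corpus)
print(sentences)
```

Real text has abbreviations and decimal numbers that this simple rule would split wrongly, so practical tools use more careful rules.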
Tokenisation
After segmenting the sentences, each sentence is then further divided into tokens. Token is the term
used for any word, number or special character occurring in a sentence. Under tokenisation, every
word, number and special character is considered separately and each of them is now a separate
token.
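Tokenisation as described here can be sketched with a regular expression that picks out runs of letters or digits and treats every other non-space character as its own token. The function name tokenise and the sample sentence are assumptions for illustration.

```python
import re


def tokenise(sentence):
    """Return words, numbers and special characters as separate tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)


# Hypothetical sentence
tokens = tokenise("The dog ate 2 biscuits!")
print(tokens)
```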
Stopwords are the words which occur very frequently in the corpus but do not add any value to it.
Humans use grammar to make their sentences meaningful for the other person to understand. But
grammatical words do not add any essence to the information which is to be transmitted through the
statement hence they come under stopwords. Some examples of stopwords are:
These words occur the most in any given corpus but talk very little or nothing about the context or the
meaning of it. Hence, to make it easier for the computer to focus on meaningful terms, these words
are removed.
Along with these words, a lot of times our corpus might have special characters and/or numbers. Now
it depends on the type of corpus that we are working on whether we should keep them in it or not.
For example, if you are working on a document containing email IDs, then you might not want to
remove the special characters and numbers whereas in some other textual data if these characters do
not make sense, then you can remove them along with the stopwords.
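The removal step can be sketched in Python as a filter over the tokens. The stopword list below is a small hypothetical sample, not a complete list, and the keep_special flag is an assumed parameter that models the choice of keeping or removing special characters and numbers.

```python
# Hypothetical (incomplete) set of stopwords
STOPWORDS = {"a", "an", "and", "are", "is", "the", "to", "of", "in"}


def remove_stopwords(tokens, keep_special=False):
    """Drop stopwords; optionally also drop tokens that are not purely alphabetic."""
    kept = []
    for token in tokens:
        if token.lower() in STOPWORDS:
            continue
        if not keep_special and not token.isalpha():
            continue
        kept.append(token)
    return kept


tokens = ["The", "dog", "is", "in", "the", "garden", "!", "2"]

cleaned = remove_stopwords(tokens)
cleaned_with_special = remove_stopwords(tokens, keep_special=True)
print(cleaned)
print(cleaned_with_special)
```

Setting keep_special=True keeps numbers and special characters, which suits data such as email IDs where those characters carry meaning.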
Here in this example, all 6 forms of hello would be converted to lower case and hence would
be treated as the same word by the machine.
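Converting to a common case takes one line in Python. The six spellings of hello used below are assumptions, since the example itself is shown as an image.

```python
# Six hypothetical spellings of the same word
forms = ["hello", "Hello", "HELLO", "hELLO", "HeLLo", "hellO"]

lowered = [word.lower() for word in forms]
unique_words = set(lowered)

print(lowered)
print(unique_words)
```

After lowering the case, only one distinct word remains, so the machine treats all six forms alike.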
Stemming
In this step, the remaining words are reduced to their root words. In other words, stemming is the
process in which the affixes of words are removed and the words are converted to their base form.
Note that in stemming, the stemmed words (words which we get after removing the affixes) might
not be meaningful. Here in this example as you can see: healed, healing and healer all were reduced
to heal, but studies was reduced to studi after the affix removal, which is not a meaningful word.
Stemming does not take into account whether the stemmed word is meaningful or not. It just removes
the affixes, hence it is faster.
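The idea of blindly removing affixes can be sketched with a toy stemmer in Python. This is not a standard stemming algorithm: the suffix list and the minimum-length rule are assumptions chosen so that the sketch reproduces the four words mentioned above.

```python
# Hypothetical list of suffixes, checked in this order
SUFFIXES = ("ing", "ed", "er", "es", "s")


def stem(word):
    """Strip the first matching suffix without checking whether the result is a real word."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[:-len(suffix)]
    return word


words = ["healed", "healing", "healer", "studies"]
stems = [stem(word) for word in words]
print(stems)
```

Because the stemmer never looks up a dictionary, it is fast, but it happily returns studi for studies.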
Lemmatization
Stemming and lemmatization are alternative processes to each other, as the role of both is the
same: removal of affixes. The difference between them is that in
lemmatization, the word we get after affix removal (also known as the lemma) is a meaningful one.
Lemmatization makes sure that the lemma is a word with meaning, and hence it takes a longer time to
execute than stemming.
As you can see in the same example, the output for studies after affix removal has become study
instead of studi.
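One simple way to picture lemmatization is as a dictionary lookup. The sketch below uses a tiny hand-written table, which is purely hypothetical; real lemmatizers rely on a full vocabulary and grammatical analysis, which is one reason they are slower than stemmers.

```python
# Hypothetical mini-dictionary mapping word forms to their lemmas.
LEMMA_TABLE = {
    "healed": "heal",
    "healing": "heal",
    "studies": "study",
    "studying": "study",
}

def toy_lemmatize(word):
    """Look the word up in the table; unknown words are returned
    unchanged (in lower case)."""
    word = word.lower()
    return LEMMA_TABLE.get(word, word)

print(toy_lemmatize("studies"))
print(toy_lemmatize("Healing"))
```

Every value in the table is a real word, so the output is always meaningful, at the cost of having to build and search the table.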
Difference between stemming and lemmatization can be summarized by this example:
With this, we have normalised our text to tokens, which are the simplest form of the words present in
the corpus. Now it is time to convert the tokens into numbers. For this, we would use the Bag of Words
algorithm.
Bag of Words
Bag of Words is a Natural Language Processing model which helps in extracting features out of the
text which can be helpful in machine learning algorithms. In bag of words, we get the occurrences of
each word and construct the vocabulary for the corpus.
This image gives us a brief overview about how bag of words works. Let us assume that the text on
the left in this image is the normalised corpus which we have got after going through all the steps of
text processing. Now, as we put this text into the bag of words algorithm, the algorithm returns to us
the unique words out of the corpus and their occurrences in it. As you can see at the right, it shows us
a list of words appearing in the corpus and the numbers corresponding to it shows how many times
the word has occurred in the text body. Thus, we can say that the bag of words gives us two things:
1. A vocabulary of words for the corpus.
2. The frequency of these words (the number of times each has occurred in the whole corpus).
Here calling this algorithm “bag” of words symbolises that the sequence of sentences or tokens does
not matter in this case as all we need are the unique words and their frequency in it.
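The idea that only the unique words and their counts matter can be sketched with Python's collections.Counter. The token list here is a made-up example, not taken from the text.

```python
from collections import Counter

def bag_of_words(tokens):
    """Return each unique token with its number of occurrences."""
    return Counter(tokens)

# Hypothetical normalised tokens, and the same tokens in reverse order.
tokens = ["we", "can", "use", "chatbots", "we", "can"]
reordered = list(reversed(tokens))

bag = bag_of_words(tokens)
print(sorted(bag))                      # the vocabulary
print(dict(bag))                        # word -> frequency
print(bag == bag_of_words(reordered))   # the order of tokens does not matter
```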
Here is the step-by-step approach to implement bag of words algorithm:
Here are three documents having one sentence each. After text normalisation, the text becomes:
Note that no tokens have been removed in the stopwords removal step. It is because we have very
little data and since the frequency of all the words is almost the same, no word can be said to have
lesser value than the other.
Go through all the steps and create a dictionary, i.e., list all the unique words which occur across
the three documents:
Dictionary:
Note that even though some words are repeated in different documents, they are all written just once
as while creating the dictionary, we create the list of unique words.
In this step, the vocabulary is written in the top row. Now, for each word in the document, if it matches
with the vocabulary, put a 1 under it. If the same word appears again, increment the previous value
by 1. And if the word does not occur in that document, put a 0 under it.
In the first document, we have the words: aman, and, anil, are, stressed. So, all these words get a
value of 1 and the rest of the words get a value of 0.
The same exercise has to be done for all the documents. Hence, the table becomes:
In this table, the header row contains the vocabulary of the corpus and three rows correspond to three
different documents. Take a look at this table and analyse the positioning of 0s and 1s in it.
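The dictionary and document vector steps can be sketched as follows. Document 1 is the token list quoted in the text; Documents 2 and 3 are assumptions made for this illustration, so the printed table may differ from the one in the figure.

```python
# Document 1 is the token list quoted in the text. Documents 2 and 3
# are assumed here purely for illustration.
documents = [
    ["aman", "and", "anil", "are", "stressed"],
    ["aman", "went", "to", "a", "therapist"],
    ["anil", "went", "to", "download", "a", "health", "chatbot"],
]

def build_vocabulary(docs):
    """List every unique word once, in order of first appearance."""
    vocabulary = []
    for doc in docs:
        for word in doc:
            if word not in vocabulary:
                vocabulary.append(word)
    return vocabulary

def document_vector(doc, vocabulary):
    """Count how often each vocabulary word occurs in one document."""
    return [doc.count(word) for word in vocabulary]

vocabulary = build_vocabulary(documents)
table = [document_vector(doc, vocabulary) for doc in documents]

print(vocabulary)
for row in table:
    print(row)
```

Each printed row is one document vector: a 0 means the word does not occur in that document, and any other number is its count.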
Finally, this gives us the document vector table for our corpus. But the tokens have still not been
converted to meaningful values. This leads us to the final step of our algorithm: TFIDF.
__________________________________________________________________________________
__________________________________________________________________________________
Bag of words algorithm gives us the frequency of words in each document we have in our corpus. It
gives us an idea that if the word is occurring more in a document, its value is more for that document.
For example, if I have a document on air pollution, air and pollution would be the words which occur
many times in it. And these words are valuable too as they give us some context around the document.
But let us suppose we have 10 documents and all of them talk about different issues. One is on women
empowerment, the other is on unemployment and so on. Do you think air and pollution would still be
one of the most occurring words in the whole corpus? If not, then which words do you think would
have the highest frequency in all of them?
And, this, is, the, etc. are the words which occur the most in almost all the documents. But these words
do not talk about the corpus at all. Though they are important for humans as they make the
statements understandable to us, for the machine they are a complete waste as they do not provide
us with any information regarding the corpus. Hence, these are termed as stopwords and are mostly
removed at the pre-processing stage only.
Take a look at this graph. It is a plot of occurrence of words versus their value. As you can see, if the
words have highest occurrence in all the documents of the corpus, they are said to have negligible
value hence they are termed as stop words. These words are mostly removed at the pre-processing
stage only. Now as we move ahead from the stopwords, the occurrence level drops drastically and the
words which have adequate occurrence in the corpus are said to have some amount of value and are
termed as frequent words. These words mostly talk about the document’s subject and their
occurrence is adequate in the corpus. Then as the occurrence of words drops further, the value of
such words rises. These words are termed as rare or valuable words. These words occur the least but
add the most value to the corpus. Hence, when we look at the text, we take frequent and rare words
into consideration.
Let us now demystify TFIDF. TFIDF stands for Term Frequency and Inverse Document Frequency. TFIDF
helps us in identifying the value of each word. Let us understand each term one by one.
Term Frequency
Term frequency is the frequency of a word in one document. Term frequency can easily be found from
the document vector table as in that table we mention the frequency of each word of the vocabulary
in each document.
Here, you can see that the frequency of each word for each document has been recorded in the table.
These numbers are nothing but the Term Frequencies!
Document frequency is the number of documents in which a word occurs, irrespective of how many times
it occurs in each of them. Here, you can see that the document frequency of ‘aman’, ‘anil’, ‘went’,
‘to’ and ‘a’ is 2 as they have occurred in two documents. The rest of the words occurred in just one
document, hence their document frequency is 1.
For the inverse document frequency, we put the document frequency in the denominator and the total
number of documents in the numerator. Here, the total number of documents is 3, hence the inverse
document frequency becomes:
The formula of TFIDF for any word W then becomes:
TFIDF(W) = TF(W) × log(IDF(W))
Here, log is to the base 10. Don’t worry! You don’t need to calculate the log values by yourself.
Simply use the log function in the calculator and find out!
Now, let’s multiply the IDF values to the TF values. Note that the TF values are for each document
while the IDF values are for the whole corpus. Hence, we need to multiply the IDF values to each row
of the document vector table.
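The multiplication described above can be sketched in Python. As before, only Document 1 comes from the text; Documents 2 and 3 are assumed, chosen so that the document frequencies match those stated above. The sketch follows the convention of this chapter, IDF = N / DF and TFIDF = TF × log10(IDF); other books and libraries define these terms slightly differently.

```python
import math

# Document 1 is quoted in the text; Documents 2 and 3 are assumptions.
documents = [
    ["aman", "and", "anil", "are", "stressed"],
    ["aman", "went", "to", "a", "therapist"],
    ["anil", "went", "to", "download", "a", "health", "chatbot"],
]
vocabulary = sorted({word for doc in documents for word in doc})
n_docs = len(documents)

# Document frequency: the number of documents that contain the word.
df = {w: sum(1 for doc in documents if w in doc) for w in vocabulary}

def tfidf(word, doc):
    """TF of the word in this document times log10(N / DF)."""
    tf = doc.count(word)
    return tf * math.log10(n_docs / df[word])

for i, doc in enumerate(documents, start=1):
    row = {w: round(tfidf(w, doc), 3) for w in vocabulary if w in doc}
    print(f"Document {i}:", row)
```

Words that appear in two of the three documents get a lower value than words that appear in only one, even though their term frequency is the same.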
Here, you can see that the IDF value for ‘aman’ is the same in each row, and a similar pattern is
followed for all the words of the vocabulary. After calculating all the values, we get:
Finally, the words have been converted to numbers. These numbers are the values of each word for each
document. Here, you can see that since we have a small amount of data, words like ‘are’ and ‘and’ also
have a high value. But as the document frequency of a word increases, its IDF, and therefore its
value, decreases. That is, for example:
Which means: log(3.3333) ≈ 0.523; which shows that the word ‘pollution’ has considerable value in
the corpus.
Summarising the concept, we can say that:
1. Words that occur in all the documents with high term frequencies have the least values and
are considered to be the stopwords.
2. For a word to have high TFIDF value, the word needs to have a high term frequency but less
document frequency which shows that the word is important for one document but is not a
common word for all documents.
3. These values help the computer understand which words are to be considered while
processing the natural language. The higher the value, the more important the word is for a
given corpus.
Applications of TFIDF
TFIDF is commonly used in the Natural Language Processing domain. Some of its applications are:
• Document Classification
• Topic Modelling
• Information Retrieval System
• Stop word filtering
DIY – Do It Yourself!
Here is a corpus for you to challenge yourself with the given tasks. Use the knowledge you have
gained in the above sections and try completing the whole exercise by yourself.
The Corpus
Document 1: We can use health chatbots for treating stress.
Document 2: We can use NLP to create chatbots and we will be making health chatbots now!
Document 3: Health Chatbots cannot replace human counsellors now. Yay >< !! @1nteLA!4Y
Accomplish the following challenges on the basis of the corpus given above. You can use the tools
available online for these challenges. Link for each tool is given below:
2. Tokenisation: https://text-processing.com/demo/tokenize/
5. Stemming: http://textanalysisonline.com/nltk-porter-stemmer
6. Lemmatisation: http://textanalysisonline.com/spacy-word-lemmatize
What is evaluation?
Evaluation is the process of understanding the reliability of an AI model by feeding a test dataset
into the model and comparing its outputs with the actual answers. There can be different evaluation
techniques, depending on the type and purpose of the model. Remember that it is not recommended
to use the data we used to build the model to evaluate it. This is because the model may simply
remember the whole training set, and would then predict the correct label for any point in the
training set without having learned anything general. This is known as overfitting.
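To see why evaluating on the training data is misleading, consider an extreme toy "model" that only memorises its training set. All the numbers below are hypothetical and chosen only for illustration.

```python
# Hypothetical toy data: (temperature in degrees Celsius, fire label).
training_set = [(45, "Yes"), (20, "No"), (48, "Yes"), (18, "No")]
test_set = [(46, "Yes"), (19, "No"), (30, "No"), (44, "Yes")]

# The "model" simply stores every training example it has seen.
memory = dict(training_set)

def memorising_model(temperature):
    """Return the remembered label, or guess "No" for unseen inputs."""
    return memory.get(temperature, "No")

def accuracy(model, dataset):
    """Percentage of examples for which the model's output is correct."""
    correct = sum(1 for x, label in dataset if model(x) == label)
    return 100 * correct / len(dataset)

print(accuracy(memorising_model, training_set))  # 100.0
print(accuracy(memorising_model, test_set))      # 50.0
```

The perfect score on the training set says nothing about how the model behaves on new data, which is why a separate test dataset is used.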
Firstly, let us go through various terms which are very important to the evaluation process.
The Scenario
Imagine that you have come up with an AI based prediction model which has been deployed in a forest
which is prone to forest fires. Now, the objective of the model is to predict whether a forest fire has
broken out in the forest or not. Now, to understand the efficiency of this model, we need to check if
the predictions which it makes are correct or not. Thus, there exist two conditions which we need to
ponder upon: Prediction and Reality. The prediction is the output which is given by the machine and
the reality is the real scenario in the forest when the prediction has been made. Now let us look at
various combinations that we can have with these two conditions.
Case 1: Is there a forest fire?
Here, we can see in the picture that a forest fire has broken out in the forest. The model predicts a Yes
which means there is a forest fire. The Prediction matches with the Reality. Hence, this condition is
termed as True Positive.
Case 2: Is there a forest fire?
Here there is no fire in the forest, hence the Reality is No. In this case, the machine too has
predicted it correctly as a No. Therefore, this condition is termed as True Negative.
Case 3: Is there a forest fire?
Here the reality is that there is no forest fire. But the machine has incorrectly predicted that there is
a forest fire. This case is termed as False Positive.
Case 4: Is there a forest fire?
Here, a forest fire has broken out in the forest, because of which the Reality is Yes, but the machine
has incorrectly predicted a No, which means the machine predicts that there is no forest fire.
Therefore, this case becomes False Negative.
Confusion matrix
The result of comparison between the prediction and reality can be recorded in what we call the
confusion matrix. The confusion matrix allows us to understand the prediction results. Note that it is
not an evaluation metric but a record which can help in evaluation. Let us once again take a look at
the four conditions that we went through in the Forest Fire example:
Prediction and Reality can be easily mapped together with the help of this confusion matrix.
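The mapping of Prediction and Reality onto the four conditions can be sketched as a simple counting function. The two lists of Yes/No values are made-up data for this example.

```python
def confusion_counts(predictions, reality):
    """Count the four conditions for paired Yes/No lists."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for predicted, actual in zip(predictions, reality):
        if predicted == "Yes" and actual == "Yes":
            counts["TP"] += 1   # fire predicted, fire in reality
        elif predicted == "No" and actual == "No":
            counts["TN"] += 1   # no fire predicted, no fire in reality
        elif predicted == "Yes" and actual == "No":
            counts["FP"] += 1   # false alarm
        else:
            counts["FN"] += 1   # missed fire
    return counts

# Hypothetical outputs of a model and the corresponding reality.
predictions = ["Yes", "No", "Yes", "No", "No"]
reality = ["Yes", "No", "No", "Yes", "No"]
print(confusion_counts(predictions, reality))
```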
Evaluation Methods
Now as we have gone through all the possible combinations of Prediction and Reality, let us see how
we can use these conditions to evaluate the model.
Accuracy
Accuracy is defined as the percentage of correct predictions out of all the observations. A prediction
is said to be correct if it matches the reality. Here, we have two conditions in which the Prediction
matches with the Reality: True Positive and True Negative. Hence, the formula for Accuracy becomes:
Accuracy = (TP + TN) / (TP + TN + FP + FN) × 100%
Here, total observations cover all the possible cases of prediction that can be True Positive (TP), True
Negative (TN), False Positive (FP) and False Negative (FN).
As we can see, Accuracy talks about how true the predictions are by any model. Let us ponder:
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Let us go back to the Forest Fire example. Assume that the model always predicts that there is no fire.
But in reality, there is a 2% chance of a forest fire breaking out. In this case, out of 100 cases, the
model will be right for 98, but for the 2 cases in which there was a forest fire, it still predicted
no fire.
Here,
True Positives = 0
True Negatives = 98
False Positives = 0
False Negatives = 2
Accuracy = (0 + 98) / 100 × 100% = 98%
An accuracy of 98% is fairly high for an AI model. But here this parameter is misleading, as the model
did not detect a single one of the actual cases where the fire broke out. Hence, there is a need to
look at another parameter which takes account of such cases as well.
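The calculation for this "always predicts no fire" model can be checked with a short sketch. The function name accuracy is our own choice for this example.

```python
def accuracy(tp, tn, fp, fn):
    """Percentage of correct predictions out of all observations."""
    return 100 * (tp + tn) / (tp + tn + fp + fn)

# 100 cases, 2 of them real fires, and the model always says "No".
print(accuracy(tp=0, tn=98, fp=0, fn=2))  # 98.0
```

The score is high even though the number of True Positives is 0, that is, none of the real fires was caught.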
Precision
Precision is defined as the percentage of true positive cases out of all the cases where the
prediction is positive (Yes). That is, it takes into account the True Positives and False Positives.
Precision = TP / (TP + FP) × 100%
Going back to the Forest Fire example, in this case, assume that the model always predicts that there
is a forest fire irrespective of the reality. In this case, all the Positive conditions would be taken into
account that is, True Positive (Prediction = Yes and Reality = Yes) and False Positive (Prediction = Yes
and Reality = No). In this case, the firefighters will check for the fire all the time to see if the alarm was
True or False.
You might recall the story of the boy who falsely cries out that there are wolves every time and so
when they actually arrive, no one comes to his rescue. Similarly, here if the Precision is low (which
means there are more False alarms than the actual ones) then the firefighters would get complacent
and might not go and check every time considering it could be a false alarm.
This makes Precision an important evaluation criterion. If Precision is high, it means that most of
the positive predictions are True Positives, giving fewer false alarms.
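As a rough sketch, Precision for the "always predicts fire" model can be computed as below. The counts assume, as in the Accuracy example, 100 cases of which 2 are real fires; this carry-over is our assumption.

```python
def precision(tp, fp):
    """Percentage of positive predictions that were actually positive.
    Returns None when the model made no positive prediction at all."""
    if tp + fp == 0:
        return None
    return 100 * tp / (tp + fp)

# Assumed counts: the model says "Yes" for all 100 cases, 2 are real fires.
print(precision(tp=2, fp=98))  # 2.0
```

With a precision this low, 98 out of every 100 alarms are false, which is exactly the situation in which the firefighters would stop trusting the model.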
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Let us consider a model that has 100% precision, which means that whenever the machine says
there is a fire, there actually is a fire (True Positive). In the same model, there can be a rare
exceptional case where there was an actual fire but the system could not detect it. This is a False
Negative condition. But the precision value would not be affected by it, because precision does not
take FN into account. Is precision then a good parameter for model performance?
Recall
Another parameter for evaluating the model’s performance is Recall. It can be defined as the fraction
of positive cases that are correctly identified. It takes into account only the cases where in
Reality there was a fire, whether the machine detected it correctly or not. That is, it considers
True Positives (there was a forest fire in reality and the model predicted a forest fire) and False
Negatives (there was a forest fire and the model didn’t predict it).
Recall = TP / (TP + FN)
Notice that the numerator in both Precision and Recall is the same: True Positives. But in the
denominator, Precision counts the False Positives while Recall takes the False Negatives into
consideration.
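The difference between the two denominators can be seen with a small sketch. The counts are hypothetical: a model that never raises a false alarm but misses one of four fires.

```python
def recall(tp, fn):
    """Percentage of real positive cases that the model detected.
    Returns None when there were no positive cases in reality."""
    if tp + fn == 0:
        return None
    return 100 * tp / (tp + fn)

# Hypothetical counts: no false alarms, but one fire went undetected.
tp, fp, fn = 3, 0, 1
precision_value = 100 * tp / (tp + fp)

print(precision_value)   # 100.0, the missed fire is not counted here
print(recall(tp, fn))    # 75.0, the missed fire lowers the recall
```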
Let us ponder… Which one do you think is better? Precision or Recall? Why?
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
Another case where a False Negative can be dangerous is Viral Outbreak. Imagine a deadly virus has
started spreading and the model which is supposed to predict a viral outbreak does not detect it. The
virus might spread widely and infect a lot of people.
On the other hand, there can be cases in which the False Positive condition costs us more than False
Negatives. One such case is Mining. Imagine a model telling you that there exists treasure at a point
and you keep on digging there but it turns out that it is a false alarm. Here, False Positive case
(predicting there is treasure but there is no treasure) can be very costly.
Similarly, let’s consider a model that predicts whether a mail is spam or not. If the model always
predicts that the mail is spam, people would not look at it and eventually might lose important
information. Here also, the False Positive condition (predicting the mail as spam while the mail is
not spam) would have a high cost.
Think of some more examples having:
a. High False Negative cost
b. High False Positive cost
To conclude the argument, if we want to know whether our model’s performance is good, we need both
of these measures: Recall and Precision. In some cases, you might have High Precision but Low Recall,
or Low Precision but High Recall. Since both measures are important, there is a need for a
parameter which takes both Precision and Recall into account.
F1 Score
F1 score can be defined as the measure of balance between precision and recall.
F1 Score = 2 × (Precision × Recall) / (Precision + Recall)
Take a look at the formula and think about when we can get a perfect F1 score.
An ideal situation would be when we have a value of 1 (that is 100%) for both Precision and Recall. In
that case, the F1 score would also be an ideal 1 (100%). It is known as the perfect value for F1 Score.
As the values of both Precision and Recall range from 0 to 1, the F1 score also ranges from 0 to 1.
Let us explore the variations we can have in the F1 Score:
In conclusion, we can say that a model has good performance if the F1 Score for that model is high.
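The variations in the F1 Score can be explored with a short sketch. The precision and recall values used here are arbitrary examples, expressed as fractions between 0 and 1.

```python
def f1_score(precision, recall):
    """Balance between precision and recall (both between 0 and 1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Arbitrary example values: (precision, recall).
for p, r in [(1.0, 1.0), (1.0, 0.1), (0.1, 1.0), (0.1, 0.1)]:
    print(f"Precision={p}, Recall={r}, F1={round(f1_score(p, r), 3)}")
```

The F1 Score is high only when both Precision and Recall are high; if either of them is low, the F1 Score is pulled down towards the lower value.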
Let’s practice!
Let us understand the evaluation parameters with the help of examples.
Challenge
Find out Accuracy, Precision, Recall and F1 Score for the given problems.
Scenario 1:
In schools, a lot of times it happens that there is no water to drink. At a few places, cases of water
shortage in schools are very common and prominent. Hence, an AI model is designed to predict if
there is going to be a water shortage in the school in the near future or not. The confusion matrix for
the same is:
Scenario 2:
Nowadays, the problem of floods has worsened in some parts of the country. Not only does it damage
the whole place but it also forces people to move out of their homes and relocate. To address this
issue, an AI model has been created which can predict if there is a chance of floods or not. The
confusion matrix for the same is:
Scenario 3:
A lot of times people face the problem of sudden downpour. People wash clothes and put them out
to dry but due to unexpected rain, their work gets wasted. Thus, an AI model has been created which
predicts if there will be rain or not. The confusion matrix for the same is:
Scenario 4:
Traffic Jams have become a common part of our lives nowadays. Living in an urban area means you
have to face traffic each and every time you get out on the road. Mostly, school students opt for buses
to go to school. Many times the bus gets late due to such jams and students are not able to reach their
school on time. Thus, an AI model is created to predict explicitly if there would be a traffic jam on their
way to school or not. The confusion matrix for the same is: