
Educator Guide - Module 2 Machine Learning PDF

This document provides an overview of machine learning and its applications. It discusses machine learning as the foundation of artificial intelligence, where machines learn from data to improve automatically without being explicitly programmed. The document covers understanding data and datasets, design thinking and problem identification, chatbots, supervised and unsupervised machine learning. It aims to help students learn the basics of machine learning and how it is used in areas like sentiment analysis, problem solving, and developing chatbots through various machine learning techniques.

Uploaded by

Rjsjjsajja
Copyright
© All Rights Reserved

Educator Handbook

Machine Learning
Module 2
Table of Contents
Learning Objectives
Machine Learning – The foundation of Artificial Intelligence
    Machine Learning
    The Need for Machine Learning
Understanding Data and Datasets
    Data and Its Utility
    Use of Data in Machine Learning
    Different Types of Datasets
    Sentiment Analysis
Design Thinking, Problem Identification, & Working with Data
    Design Thinking
    Problem Identification
Development and Understanding of the BOT Framework
    What is a BOT?
    What can a BOT do?
    The BOT Framework
    Data Labeling
Machine Learning and CHATBOTs
    Robots and Humans
    BOTS – Redefining Workplaces
    The Science behind the Generation of a BOT
    Building blocks - Program for a BOT
    Demonstration
Introduction to Supervised Machine Learning
    Major machine learning methods
    Supervised learning
    Semi-Supervised learning
    Reinforcement learning
    Supervised Machine Learning
    Classroom Activity
Introduction to Unsupervised Machine Learning
    Unsupervised learning
    Clustering
Assessments Questions
Questions to consider
Some Practical Assignments/Lab Work
Practical Assignments
Further Reading
Reference Links
Glossary

Disclaimer:

The Imagine Cup Junior guides and lesson materials are created by Microsoft and our partners
and are intended for guidance only, to support the Imagine Cup Junior Challenge. For
the latest on Microsoft AI please visit https://www.microsoft.com/en-us/ai

Learning Objectives
Through this module, students will get an overview of machine learning and understand how it
provides the foundation of AI. Students should be able to understand the basics of machine
learning and use the concepts as applied to their daily life.
At the conclusion of the module, students should be able to:
• Understand the basics of machine learning.
• Comprehend the basics of using datasets and working with data.
• Understand the machine learning approach to problem-solving, and devise solutions to
problems.
• Comprehend the latest design-related perspectives, ideas, concepts, and solutions.
• Understand the importance of data and ways to protect it.
• Execute projects using design thinking principles.
• Understand and analyze data related problems.
• Comprehend the basics of creating a BOT and the related working framework.
• Understand the various challenges of creating a BOT.
• Appreciate the similarities and differences between a machine-driven BOT and a human.
• Understand the concept of cluster algorithms and apply the principle of ‘clustering’ on data.

Machine Learning – The foundation of Artificial Intelligence
Often used interchangeably with artificial intelligence, machine learning, however, has a
different meaning. It is the ‘learning’ that the machine derives from its experience in
processing data. The primary objective of machine learning is to ensure that the machine learns
from the data. In other words, machine learning is an application of artificial intelligence (AI)
that provides computer systems with the ability to automatically learn and improve from
experience without being explicitly programmed. Machine learning as a science focuses on the
development of computer programs that can access data and use it to learn for themselves. This
is sometimes known as heuristic programming.

Machine learning can also be defined as the study of computer-based algorithms designed to
improve automatically through experience. Machines are created with a built-in capability to
read and understand human language to comprehend their surroundings and make as many accurate
predictions as they can. They can also perform simultaneous real-time assessments of predictions
and adapt according to their environment. When a user wants to search a topic, the search engine
shows the most frequently searched related ‘search topics’. The search engine looks at past
clicks from people around the world in order to understand which pages are more relevant for
those searches than others. It then serves those results as a list, with the most relevant at
the top. It should be noted that such an exercise is impossible for humans to perform in the
time frame of a few seconds.

Notes for the educator
The educator could help students understand the concept of machine learning and its history,
mentioning the applications in futuristic vehicles for space exploration. Here are two links
that the educator could use to understand and explain the concept to the students.
Video: How Machine Learning works?
Video: What is machine learning, and how does it work?
Note – these are suggested links. The educator is free to use other means or media to convey an
understanding of the concept of machine learning to themselves and students.

The machine learns how to handle search requests and generate a set
of instructions to create the expected outcome. Hence, machine
learning can also be understood as a set of procedures, which deals
with huge amounts of data smartly (using algorithms or a set of logical
rules) to derive results.
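
The search example above can be sketched in a few lines of Python. This is only a classroom illustration of "learning from past clicks", not how any real search engine works; the click log, the page names, and the `rank_pages` function are all hypothetical.

```python
from collections import Counter

# Hypothetical click log: (search query, page the user clicked on).
click_log = [
    ("machine learning", "intro-to-ml"),
    ("machine learning", "ml-course"),
    ("machine learning", "intro-to-ml"),
    ("machine learning", "ml-news"),
    ("machine learning", "intro-to-ml"),
    ("machine learning", "ml-course"),
    ("chatbots", "what-is-a-bot"),
]

def rank_pages(query, log):
    """Return the pages clicked for this query, most-clicked first."""
    clicks = Counter(page for logged_query, page in log if logged_query == query)
    return [page for page, _ in clicks.most_common()]

print(rank_pages("machine learning", click_log))
```

The more clicks the log contains, the better the ranking reflects what people actually find relevant, which is the sense in which the machine "learns" from data.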

Notes for the educator
The educator can explain the concept of machine learning and how it supplements the creation of a BOT
using vast amounts of data. Students are not expected to learn how to create a BOT as it is beyond their
current level of education. However, the theoretical element to data collection and design, review of
samples, and test cases of BOT creation is what the students should accomplish.
It is recommended to quote examples such as:
Social Media:
What are you interested in: Machine learning works on the simple concept of understanding based on its
experiences. The Microsoft News App, for example, continuously looks at what is trending, what you read
most frequently and based on this data, ‘learns’ what you like to look at and what interests you and
populates the pages you read with the most relevant news items from across the world based on your
preferences.
E-commerce:
You shopped for a product online a few days ago but then keep receiving emails from the store and others
with shopping suggestions. If not this, then you may notice that the shopping website or the e-commerce
app you use recommends some items that somehow match with your taste. Certainly, this helps you
potentially with your shopping experience but did you know that it’s machine learning doing the ’magic’
behind the scenes? Based on your shopping behavior using past purchases, items liked or added to cart,
brand preferences, etc., other product recommendations are made and presented to you.
At this point the educator should prompt students to explore other examples of machine learning in their
day-to-day life.

Machine Learning
Whilst you are all, by now, familiar with artificial intelligence (AI), machine learning is a
specific subset of AI which trains a machine on how to learn. It is an application of artificial
intelligence that provides computer systems with the ability to automatically learn and
improve based on their experience without being explicitly programmed.

The process of machine learning begins with analyzing observations (data) such as examples,
direct experiences, or instructions, and looks for patterns in the data. Based on this analysis and
the cumulative data it was provided, it learns to make better decisions in the future. The
primary goal is to allow the machine to learn automatically without human intervention or
assistance and adjust actions accordingly.

Fig 2.1: Phase 1 of Machine Learning

Fig 2.2: Phase 2 of Machine Learning

The Need for Machine Learning
The reason behind machine learning is to automate mundane tasks to the extent that the
machine can learn, think and make smart decisions on its own. It is also to minimize human
interference and thus bias in various scenarios. The need for machine learning is to complete
tasks that are too complex for humans to code computers for directly. Some tasks are so
complex that it is impractical, if not impossible, for humans to cater for all the nuances and
code for every single instance separately. Instead, a large amount of data is provided to a
machine learning algorithm and the algorithm computes the result by exploring the data and
constructing a model that will achieve the desired outcome.
Machine learning is also useful for finding relationships between things, especially in
exceptionally large datasets which are too big for humans to process efficiently. Its uses here
are in object recognition, marketing analytics, analyzing scientific data in labs, and numerous
other applications that involve large amounts of data needing to be analyzed.

Fig 2.3: Approach towards Traditional Programming and Machine Learning

The key difference between traditional programming and machine learning is that in traditional
programming, the data and the program are run on the computer to produce an output. However,
in machine learning, the data and the output are fed into a computer to create a program. This
program can then be used in the same way as one created by traditional programming.
A few examples of Machine Learning in our day-to-day life are:
• Cortana
• Refined search engine results (as represented in Fig 2.4)
Notes for the educator
The educator can demonstrate the difference between machine learning and traditional
programming. Whilst traditional programming is about using a program and data to create an
output, machine learning is fed the data and the expected outcome and generates a program
that can be improved as the data or the scenario changes.
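
The contrast between the two approaches can be shown with a small Python sketch. It is an illustration under simple assumptions, not a production technique: the temperature conversion task, the example values, and the function names `fahrenheit_rule` and `fit_line` are chosen here for demonstration, and the "learning" is a basic straight-line fit (least squares).

```python
# Traditional programming: a person writes the rule; data goes in, output comes out.
def fahrenheit_rule(celsius):
    return celsius * 9 / 5 + 32

# Machine learning: data and expected outputs go in; a "program" (a fitted line) comes out.
def fit_line(inputs, outputs):
    n = len(inputs)
    mean_x = sum(inputs) / n
    mean_y = sum(outputs) / n
    # Least-squares slope and intercept for a straight line through the examples.
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, outputs))
             / sum((x - mean_x) ** 2 for x in inputs))
    intercept = mean_y - slope * mean_x

    def learned_program(x):
        return slope * x + intercept

    return learned_program

# Hypothetical training examples: temperatures in Celsius with their known Fahrenheit values.
celsius_examples = [0, 10, 20, 30, 40]
fahrenheit_examples = [32, 50, 68, 86, 104]

learned = fit_line(celsius_examples, fahrenheit_examples)
print(fahrenheit_rule(25), learned(25))
```

Nobody told `fit_line` the conversion formula; it recovered an equivalent rule from the examples alone, and it would produce a different rule if given different examples.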

Fig 2.4: Search Engine Result Refining, a typical example of machine learning

Understanding Data and Datasets
Data and Its Utility
This is Datum

Fig 2.5: Data

Data can be defined as the collection of facts, numbers or other information that are used
either for reference, or analysis. The singular form of Data is Datum.
In Figure 2.5, the task is to compute the overall percentage of each student’s marks. The
percentage a student obtains in an exam is calculated as the sum of the marks obtained in all
subjects divided by the number of subjects (assuming each subject is marked out of 100).
Therefore, the marks are important for calculating the percentage, and to arrive at the result
it is important to take into account the marks obtained by each student in each subject.
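
The calculation described above can be written out as a short Python sketch. The student names and marks below are made up for illustration (they are not the values in Fig 2.5), and the sketch assumes every subject is marked out of 100.

```python
# Hypothetical marks; each value is one datum, and together they form the data.
marks = {
    "Asha": {"Maths": 80, "Science": 90, "English": 70},
    "Ben": {"Maths": 60, "Science": 75, "English": 90},
}

def percentage(subject_marks, max_marks_per_subject=100):
    """Overall percentage: total marks obtained out of total marks available."""
    total_obtained = sum(subject_marks.values())
    total_available = len(subject_marks) * max_marks_per_subject
    return 100 * total_obtained / total_available

for student, subject_marks in marks.items():
    print(student, percentage(subject_marks))
```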
Notes for the educator
The educator can ask the students – What do they understand by ‘data’? What is their definition of data?
Have they ever collected any data? If yes, then why?
Then provide the students with an example – Suppose you are celebrating your friend’s birthday party. What
are all the things you need to consider to complete that task? Probable answers would be – making a list of
guests, a menu, list of furniture, list of games to be played at the party, list of songs to be played, birthday
theme, deciding the venue, etc. All these are examples of data. Data is another word for information.
The educator can explain to the students that this is a very basic example of data. However, in an
organization and the real world there are different types of data used for different purposes. The educator
can demonstrate what data is and what constitutes a dataset.

Dataset
A dataset is defined as a collection, or group, of data where every column denotes a particular
variable and each row relates to a specific member of that dataset.
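
As a minimal sketch of this definition, a small dataset can be represented in Python as a list of rows, where each row is one member and each key is one column (variable). The students and values are hypothetical, and `column` is a helper written only for this example.

```python
# Hypothetical dataset: each row is one student, each key is one variable (column).
dataset = [
    {"name": "Asha", "age": 12, "favourite_subject": "Maths"},
    {"name": "Ben", "age": 13, "favourite_subject": "Art"},
    {"name": "Chen", "age": 12, "favourite_subject": "Science"},
]

def column(rows, variable):
    """Collect the values of one variable across every row of the dataset."""
    return [row[variable] for row in rows]

print(column(dataset, "age"))   # one column: the same variable for every member
print(dataset[1])               # one row: every variable for a single member
```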

This is a Dataset

Fig 2.6: A Collection of Data is called a Dataset

Datasets are needed to create the learning algorithm the machine uses in a particular context
of Artificial Intelligence. The method adopted by machines to learn is to use automatic data
analysis for building concepts. The whole model of machine learning is built on the premise
that systems can be programmed to learn from the data they receive as input. This is done
through the identification of patterns to make informed decisions with minimal human
intervention.
The entire process begins with the input of data into the machine. Data can be accumulated
either through datasets or real-time data through physical sensors such as cameras,
temperature sensors, microphones etc. The machine is equipped to understand and analyze
patterns and then perform certain tasks using those patterns as references. The machine works
iteratively: as the model is exposed to various kinds of data, it becomes capable of adapting
itself independently.
Like humans, the machine learns from earlier results on similar scenarios, so the application
improves the more it is used. This leads to more trustworthy results, but remember, the
trustworthiness of the output is only as robust as the datasets it has been provided with over
time.

Use of Data in Machine Learning
Data can be in the form of text, images, numbers, and even sound or video. The datasets are
analyzed to create an experience which in turn is used to create a form of machine learning
program.
Notes for the educator
The educator can use the diagram below to help students understand the process of feeding the
machine input data and receiving the expected outcomes. The educator can explain that a machine
learning program is intelligent enough to learn from the data it is fed (Input) and the expected
outcome.
This creates a pattern of learning with which to create the optimal model that can cater for
any other similar data with similar kinds of expected outputs.

Fig 2.7: Machine Learning
Image Source - https://quantdare.com/machine-learning-a-brief-breakdown/

Different Types of Datasets


Different kinds of datasets are used to achieve a particular machine
learning objective. Here are a few of them:
• Image Processing
• Sentiment Analysis
• Natural Language Processing
• Video Processing
• Speech Recognition
• Internet of Things (IoT)

Image Processing
Datasets for image processing can be used for object captioning, detection and segmentation
of the dataset (Maj, 2019).

Fig 2.8: Datasets for image processing


Image Source - https://www.kdnuggets.com/2018/09/object-detection-image-classification-yolo.html

Fig 2.9: Segmentation and Captioning through Machine Learning


Image Source - https://towardsdatascience.com/faster-r-cnn-object-detection-implemented-by-keras-for-custom-data-
from- a browsers-open-images-125f62b9141a

A dataset of a variety of facial expressions is used to understand an expression and caption
the image accordingly, for example as a happy or sad face.

Sentiment Analysis
Whilst one parameter may be Object Recognition, another is that of the human sentiment. This
layer of ‘sentiment analysis’ when put into context can categorize the various human emotions
as a datatype and its intensity. The algorithms used for the analysis of the human sentiments
are advanced and designed to generate accurate and useful results. Examples of the use of this
can be to analyze the sentiment of a customer.

Fig 2.10: Emotions depicted in a smiley

By analyzing sentiment accurately, and in particular when people are unhappy, the application
can focus on actions that could alter the person’s emotion. This could be used for good in
supporting people with certain mental illnesses, or for a not so good purpose such as
convincing someone to purchase a product they may not wish to. Remember it is not the
technology but how it is used that is important.

Notes for the educator
The educator can conduct an activity by asking students to volunteer to demonstrate various
emotions that they can think of, or to even imitate any emoticon of their choice, whilst the
rest of the class try to decipher which emoticon they are portraying. Alternately the educator
can use a printout of common emoticons, show it to the class and prompt students to name the
emotions they represent.
This is to drive home the fact that humans recognize emoticons easily and associate a name or
a tag to it in the same way that machines do. The machine learning algorithm memorizes the
image and tags it with emotions and words to identify it in future applications.

Fig 2.11: Steps of Sentiment Analysis of an Ecommerce Platform

How is this achieved? Two ‘Polarity’ nodes are created; one for a positive sentiment, the other
for a negative sentiment. This is done to assist in identifying the right sentiment of a person.
The words associated with a given polarity node are then re-submitted to the algorithm for
more accurate sentiment analysis.
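
The idea of two polarity nodes can be sketched in Python as two word lists, one positive and one negative. This is a deliberately simple word-counting approach for classroom discussion, not the method used by any particular product; the word lists and the `polarity` function are hypothetical.

```python
import re

# Hypothetical word lists standing in for the two polarity nodes.
POSITIVE_WORDS = {"good", "great", "love", "happy", "excellent"}
NEGATIVE_WORDS = {"bad", "poor", "hate", "sad", "terrible"}

def polarity(text):
    """Label a piece of text by counting the words that belong to each polarity node."""
    words = re.findall(r"[a-z']+", text.lower())
    positive = sum(word in POSITIVE_WORDS for word in words)
    negative = sum(word in NEGATIVE_WORDS for word in words)
    if positive > negative:
        return "positive"
    if negative > positive:
        return "negative"
    return "neutral"

print(polarity("I love this phone, great camera"))
print(polarity("Terrible battery and poor screen"))
```

A real system would keep growing and refining the word lists from new reviews, which corresponds to re-submitting words to the algorithm as described above.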

Fig 2.12: Polarity Nodes for Sentiment Analysis

Natural Language Processing (NLP)
NLP uses language comprehension to train the machine to adapt to natural changes in
language such as the addition of new words, editing of words to suit the current context,
labeling new words according to usage, and deleting words that have become obscure and no
longer in use.
Natural language processing can be defined as a technology which enables the machine to
comprehend human language in the way it is spoken and understood by humans. One of the
most striking aspects is that many natural languages such as English, French, German, and
Mandarin Chinese, etc. keep evolving and the learning element of the machines adapts to this.
There is no fixed or permanent structure to language, thus making it very flexible. Different
kinds of dialects and sub-dialects are spoken in different regions and each one of these is
slightly different from the other. This includes the use of slang and other urban or sub-cultural
languages.

Fig 2.13: Steps of Sentiment Analysis of a Yammer Discussion

Fig 2.14: Example of Sentiment Analysis of a Yammer discussion

In Fig 2.14, the AI software takes words from a micro blogging website and associates them
with a particular sentiment. This enables it to identify the overall tone of the conversation.

Video Processing
In this example the AI software takes screenshots at regular intervals from a live video stream
and analyses them to count the number of people who are about to get onto the bus. By
recognizing other objects it can also calculate other parameters such as the frequency of buses
arriving at the bus stop, or automatically identify crowd rush hours across the day. This data
can then be used to manage public transport more efficiently.
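
Once people have been counted in each screenshot, finding rush hours is a simple aggregation. The Python sketch below assumes the counting has already been done by some detector (not shown); the `snapshot_counts` values, the threshold, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical detector output: (hour of day, people counted in one screenshot).
snapshot_counts = [(7, 4), (8, 15), (8, 21), (9, 12), (13, 3), (17, 18), (17, 24), (18, 9)]

def average_by_hour(counts):
    """Average number of people seen per screenshot, grouped by hour."""
    grouped = defaultdict(list)
    for hour, people in counts:
        grouped[hour].append(people)
    return {hour: sum(values) / len(values) for hour, values in grouped.items()}

def rush_hours(counts, threshold):
    """Hours whose average crowd size reaches the chosen threshold."""
    averages = average_by_hour(counts)
    return sorted(hour for hour, average in averages.items() if average >= threshold)

print(rush_hours(snapshot_counts, threshold=15))
```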

Fig 2.15: Video Processing

Speech Recognition
Speech recognition can be defined as a technology which enables the recognition of the
spoken word and subsequent translation into text. The machine learns ways of identifying and
analyzing various human voices, both live and recorded, and processes them accordingly, such
as the conversion into text for dictation purposes in word-processors such as Microsoft Word.
It is also used to understand the user in voice-activated modules in automated cars, and in the
important role of assisting those people with disabilities.
Internet of Things (IoT)
The Internet of Things (IoT) makes reference to the countless networked devices we use to
make our lives easier. These devices rely on the Internet to gather and share data from
responsible sources in order to provide ‘smart’ services. With these huge datasets and massive
amounts of data sources becoming a reality, machine learning has become an integral part of
our daily lives.

Machine learning can be applied in almost all scenarios where the outcome is known. It can
however also be applied where the datasets are unknown and in situations where there are
repeated forms of the same sort of data which can be used to reinforce the machine learning.
For example, machine learning can help in understanding and analyzing the patterns of waves
and oceanic currents in order to predict future sea temperatures, monsoon patterns, and even
the potential for a cyclone or other natural disaster in a specific geographical location.
Capturing IoT and Sensor Data
The Internet of Things (IoT) is more of a concept than an actual thing. The concept is to allow
us to interpret data from networked sensors or devices in the most meaningful ways possible.
The aim is to measure, analyze, visualize, predict, and react to the data accumulated from these
sensors. One form of IoT most people are familiar with is a smart thermostat, smart switches, or
other internet-connected devices and appliances in your house. These are generally considered
part of Consumer IoT. Then there's Industrial IoT, or IIoT. This includes things like the use of IoT
devices in smart buildings, industrial automation, and monitoring of industrial processes.
Processing IoT Data
Processing the data from connected IoT sensors requires time and many interactions with sub-
procedures such as:
• Standardizing or transforming the data into a uniform format to ensure it is compatible with
your application.
• Creating and storing a backup of the newly transformed data.
• Removing any repetitive, outdated, or unwanted data to help improve accuracy.
• Integrating additional structured (or unstructured) data from other sources to help enrich
the dataset.
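
The four sub-procedures above can be sketched as a small Python pipeline. Everything here is an assumption made for illustration: the sensor names, the readings, the `room_info` lookup, and the choice to treat two readings from the same sensor at the same time as a repeat.

```python
import copy

# Hypothetical raw readings from two sensors that report in different formats.
raw_readings = [
    {"sensor": "kitchen", "time": 1, "temp_c": 21.0},
    {"sensor": "kitchen", "time": 1, "temp_c": 21.0},   # repeated reading
    {"sensor": "garage", "time": 1, "temp_f": 50.0},    # reported in Fahrenheit
    {"sensor": "garage", "time": 2, "temp_f": 53.6},
]
room_info = {"kitchen": "indoor", "garage": "outdoor"}  # extra data from another source

def standardize(reading):
    """Step 1: convert every reading to the same format (degrees Celsius)."""
    if "temp_f" in reading:
        temp_c = (reading["temp_f"] - 32) * 5 / 9
    else:
        temp_c = reading["temp_c"]
    return {"sensor": reading["sensor"], "time": reading["time"], "temp_c": round(temp_c, 1)}

def remove_repeats(readings):
    """Step 3: keep only the first reading for each (sensor, time) pair."""
    seen = set()
    unique = []
    for reading in readings:
        key = (reading["sensor"], reading["time"])
        if key not in seen:
            seen.add(key)
            unique.append(reading)
    return unique

def enrich(readings, info):
    """Step 4: add information from another source to each reading."""
    return [{**reading, "location": info[reading["sensor"]]} for reading in readings]

standardized = [standardize(reading) for reading in raw_readings]
backup = copy.deepcopy(standardized)                     # Step 2: keep a backup copy
cleaned = enrich(remove_repeats(standardized), room_info)
print(cleaned)
```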
IoT Data Analytics
When we apply data analysis tools or procedures to different types of IoT data, the process is
called IoT analytics. This process is performed on huge datasets to improve the efficiency of
procedures, applications, business processes, and production. Several types of data analytics
can be used on IoT data:
Prescriptive analytics
Prescriptive analytics is used to analyze what steps to take in a specific situation. It’s often
described as being a combination of descriptive and predictive analysis. When used in
commercial applications, prescriptive analytics helps decipher large amounts of information to
obtain more precise conclusions.
Spatial analytics
This is used to analyze location based data. Spatial analytics deciphers various geographic
patterns, determining any type of spatial relationship between various physical objects. Parking
applications, smart cars, and crop management are all examples of applications that benefit
from spatial analytics.

Streaming analytics
Streaming analytics, sometimes referred to as event stream processing, is the analysis of
massive datasets that arrive continuously, such as live video or sensor feeds. These real-time
data streams can be analyzed to detect emergency or urgent situations, facilitating an
immediate response. The types of IoT applications that benefit from streaming analytics include
those used in traffic analysis, air traffic control, and police CCTV monitoring.
Time series analytics
Time series analytics is based on time-based data, which is analyzed to show any anomalies,
patterns, or trends. Two systems that greatly benefit from time series analytics are health and
weather-monitoring systems.
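
One simple way to spot an anomaly in time-based data is to flag readings that lie far from the average. The Python sketch below uses this approach with made-up temperature readings; the threshold of two standard deviations and the `find_anomalies` function are choices made for this example, and real monitoring systems typically use more careful methods.

```python
from statistics import mean, pstdev

# Hypothetical hourly temperature readings in degrees Celsius; one value is unusual.
readings = [20.1, 20.4, 20.3, 20.6, 35.0, 20.5, 20.2, 20.4]

def find_anomalies(values, threshold=2.0):
    """Return (position, value) pairs lying more than `threshold` standard deviations from the mean."""
    average = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return []  # every reading is identical, so nothing stands out
    return [(index, value) for index, value in enumerate(values)
            if abs(value - average) > threshold * spread]

print(find_anomalies(readings))
```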
We are surrounded by IoT data in our homes, our cars, and in our schools. The amount of data
that IoT technology produces is massive. By collecting, processing, and analyzing this data, we
can gain valuable insights to help us make better decisions about the future.
The following links give access to free datasets of IoT and sensor-based data for you to
download.
• https://data.world/datasets/iot
• https://hub.packtpub.com/25-datasets-deep-learning-iot/
• https://www.kaggle.com/uciml/biomechanical-features-of-orthopedic-patients
• https://www.datasciencecentral.com/profiles/blogs/great-sensor-datasets-to-prepare-your-next-career-move-in-iot-int

Design Thinking, Problem Identification, & Working with Data

Notes for the educator
The educator can prompt students to come up with various applications of AI that the students can think of as being used in daily life. Suggested answers could be music streaming, social networking, and e-commerce sites with predictive recommendations. The educator can later introduce an activity of looking at these applications as a solution to particular problems to simplify human life. It is recommended that the educator designs various problems that can be based on simple real life situations and students can be asked to suggest potential solutions. For example, the educator can prompt the students to think about how AI could help improve the quality of education, healthcare, defense, agriculture, retail, etc. Students also discuss where they can (or wish to) apply AI in their personal real-life to solve individual problems.
For a better understanding of ‘Design thinking’ you can download a book (pdf format) from the following link:
https://www.designbetter.co/design-thinking
The educator can show the following video which illustrates how a primary school has used design thinking with their students and how they used every step of design thinking to implement a solution of their own:
https://www.edutopia.org/video/design-thinking-problem-solving-framework
In a class activity, students can empathize with others around the world in solving real-world problems.

Design Thinking
Design thinking is a non-linear iterative process designed to help understand a problem, the users it affects, assumptions made, and any available solutions keeping in mind all the parameters. Design teams are responsible for implementing solutions to problems within a specific time frame. Design thinking is therefore a problem-solving methodology aimed at devising solutions.
Over the years, ‘design thinking’ has gained prominence in terms of its effectiveness in solving problems or finding as many alternative solutions as possible. Organizations such as Microsoft use it successfully to design and create products. Other organizations such as Universities, Banks and other companies also use it to help solve real-life problems in their industry sectors.
The various stages of design thinking are a part of an iterative process where the ultimate objective is to acquire an in-depth understanding of the problem and suggest a single solution or alternatives within specific boundaries.

Stage 1: Empathize - Research Your User’s Needs
This stage permits the team to gain an understanding of the problem, and to what extent it affects the user. It requires gaining an empathetic understanding of the user’s issues. Empathy, or the ability to imagine oneself in the condition of another, is vital in any human-centric design process as the process is not to be based on the team’s assumptions but in terms of the user’s perspective.

Stage 2: Define - State Your User’s Needs and Problems
In this stage, information is accumulated from the previous stage for analysis. The team makes its observations and a definition of the core problem is written based on these aspects. This is known as the Problem Statement and the most important aspect of this is to understand that the problem statement should be as human-centric as possible. We will teach you how to create a Problem Statement in depth later in the course.
Stage 3: Ideate - Challenge Assumptions and Create Ideas
Once the design team enters this third stage, they completely understand the user and the matters of concern to the user, and have defined the problem as exactly as they can. Now is the time for the team to come up with innovative ideas by thinking creatively and ‘outside-the-box’.

Stage 4: Prototype - Start to Create Solutions
This next stage is an experimental phase with the sole objective of identifying the best possible solution that can be provided to solve the problem. The aim is to find solutions that are possible, inexpensive, and achievable.

Stage 5: Test - Try Your Solutions Out
At this final stage the team tests its solutions to check their feasibility and recommends the best possible one. As an iterative process, the results obtained may be used to redefine one or more of the solutions identified by choosing to return to previous stages in the process in order to make further changes, refinements, or to rule out particular alternative solutions.

Problem Identification

Notes for the educator
The educator can ask students to identify a real-life problem and try to solve it using the design thinking techniques taught earlier.

Creating the Problem Statement
A problem statement should be designed to address the Five Ws (Who, What, When, Where and Why). A simple and well-defined problem statement is often used by every member of a project team to understand the problem and work together toward developing a solution.
The reason for the existence of a problem statement is the
identification and explanation of the problem itself and as such
includes a description of the environment where the problem exists, and the impact that it has
on other elements such as users, finances, resource allocation, or additional activities. A
problem statement also explains the anticipated environment the solution is to run in. This
definition of the problem helps create a holistic overview and to define the problem in an
elaborate but clear manner. Furthermore, the project goals to be accomplished, and the
purpose for initiating a project can also be written clearly without doubt, ambiguity, or
uncertainty of any kind.
Another useful purpose of creating a problem statement is that it can be used as a
communications mechanism to others. It helps the project team identify any support staff and
other kinds of expertise that may be needed to complete the project. Before the start of the
project, the people involved need to understand the problem, and goals, not only from a team
perspective but also from an individual contributor’s perspective. This step makes it clear what
the goal of the project is as well as the role that each team member will play in the execution
of the project.
Defining the boundary of the problem
Every problem has limitations to an extent beyond which the solution devised to solve the
problem is not applicable. The extent of these limitations is known as the boundary of the
problem. In reality, there is no physical boundary, but more an understanding that exists
between each team member.
To define the boundary of a problem it is important to focus on the real issues that make up
the problem. This can be achieved only when there is a thorough clarification of these issues. It
is understood by all team members that the boundary is a clear demarcation between the
factors that will greatly impact the problem and lesser-affecting factors. Lesser-affecting factors
are not considered to be within the boundaries of the problem definition and are thus not
considered when creating the solution.

It is important to understand that the problem boundary will differ for each person devising
solutions for a problem, or set of problems. This is based on their
understanding of the problem. Their understanding will also be affected by their concerns and
any underlying human biases. This may pose slight setbacks in the project. However, as the
design team works cohesively, any biases are more often than not taken care of early in the
process.
How will data be accessed, managed and analyzed?
Machine learning is an especially important component of artificial intelligence which relies
heavily on data. What a machine learning device will achieve or not is based on the kind of
data it is given as input. Hence, it is vitally important that the input data is acquired from
sources that are reliable and are not tampered with or altered in any way.
Every algorithm needs to be fed a particular kind of data depending on the expected outcome
to be performed by the machine. Training any AI-based algorithm often requires thousands or
even millions of points of information. The data required often may be unavailable as access to
it may conflict with privacy rules or government regulations on sharing data. In other words,
gaining access to data is a complicated process. Hence, it is required to have best practices
policies in place to ensure that all AI related systems follow the rules of privacy and security
uniformly.
What are the privacy and security aspects around the data collection?
There have been incidents in the past that have raised questions as to the sanctity of data
being used by machine learning devices. It is mandatory for the organization handling the data
to ensure that the security of that data is maintained and is far out of reach from people with
malicious intentions.
The amount of data collected and stored each day is enormous. Data organizations gather data
from innumerable sources such as live data sources, blogs, social media, and other sources,
which is quite an extensive task and therefore we need to have strong, robust data
management systems and stringent laws to protect it from misuse of any kind.
Access to data
Data Access Statements are statements that are created to document the datasets required to
support a specific purpose and the necessary conditions under which they can be found and
used.
Research data archive and repository organizations provide users with a permanent identifier
for the data they house, known as a Digital Object Identifier (DOI) or accession number.

Openly available data
The following are things that should be provided in a Data Access Statement:
• The name(s) of the repositories/archives managing the dataset(s)
• The persistent identifier for your dataset
• Data subject to access restrictions
• Justification for the data to be subject to access restrictions (for example, ethical, legal, or
commercial sensitivity)
• Information on arrangements for accessing the data, including the persistent identifier to the
dataset, or a statement that the data are not accessible
• If you have used secondary or third-party data, the data source should be credited
• If you have used secondary or third-party data, you can provide information on how the data
were accessed
Data Access Statements can also be combined with formal data citations, particularly when a
publication is supported by multiple datasets in different locations.

Development and Understanding of the BOT Framework
What is a BOT?
Have you ever watched someone in your family do shopping from an e-commerce website?
Often, you will find that a small window pops up at an extreme corner of the screen offering
assistance in case they are facing difficulty with shopping or making a payment. This
application holds a chat with the human user in a human-like manner and is known generically
as a BOT, an abbreviation of Robot.

Fig 2.16: A user communicating with a chat BOT of an online shopping site

A BOT can be defined as an application that is specifically designed to perform a specific
automated task. Some examples of BOTs are Cortana (Microsoft) and Clippy (Microsoft).

Fig 2.17: Clippy, an early Microsoft office assistant

Though BOTs are capable of performing helpful tasks, many malicious BOTs exist which can install a virus onto your computer and cause great damage. However, we will be studying CHATBOTs that are designed to hold conversations with a human and help by providing them with the desired information they require in the most accurate manner possible.

What can a BOT do?
CHATBOTs exist that can order food, write an email for us, set alarms, tell us about our current finances, help save us money, shop for us, and find tourist spots, monuments and even restaurants close by to our current location. For example, Digit is a CHATBOT that helps manage expenses by showing you your bank balance and upcoming bills and tries to help save you money by offering you relevant financial services via text messages. Another example is the ‘WHO CHATBOT’ available in Microsoft Teams, which searches for anyone inside your organization, or Calendar BOT, also in Microsoft Teams, which helps you find the best date and time to meet with the people you want to by checking your diary against the other person’s calendar.

Notes for the educator
The educator can explain how the BOT framework works. More on the BOT Framework can be found here:
https://docs.microsoft.com/en-us/azure/bot-service/bot-builder-basics?view=azure-bot-service-4.0&tabs=csharp

The BOT Framework

Fig 2.18: A Generic BOT Framework

A generic BOT framework consists of a communication channel between the user and the
CHATBOT. The CHATBOT interprets the text written by the user using natural language
processing (NLP) and translates it into a form that is understood by the machine.
The CHATBOT then communicates with a cloud service to fetch the necessary response and
translates it from machine language to the natural language of the user and communicates the
response to them.
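The round trip described above can be sketched in a few lines of Python. This is only a classroom illustration, not the Microsoft Bot Framework API: the keyword table stands in for a real NLP model, the dictionary of replies stands in for the cloud service, and every name in it (INTENT_KEYWORDS, RESPONSES, detect_intent, fetch_response, chatbot_reply) is hypothetical.

```python
# Hypothetical keyword table standing in for a real NLP model.
INTENT_KEYWORDS = {
    "check_balance": ["balance", "account"],
    "order_food": ["order", "pizza", "food"],
}

# Hypothetical canned replies standing in for a cloud service lookup.
RESPONSES = {
    "check_balance": "Your balance is shown in the app under 'Accounts'.",
    "order_food": "Sure, what would you like to order?",
    "unknown": "Sorry, I did not understand that.",
}


def detect_intent(user_text):
    """Map free text to an intent label (a crude stand-in for NLP)."""
    words = user_text.lower().replace("?", " ").replace(".", " ").split()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in words for keyword in keywords):
            return intent
    return "unknown"


def fetch_response(intent):
    """Look up the reply for an intent (stands in for the cloud service call)."""
    return RESPONSES.get(intent, RESPONSES["unknown"])


def chatbot_reply(user_text):
    """Full round trip: user text -> intent -> service response -> reply text."""
    return fetch_response(detect_intent(user_text))


print(chatbot_reply("What is my balance?"))
print(chatbot_reply("Tell me a joke"))
```

Students can extend the keyword table and observe how quickly simple keyword matching breaks down, which motivates the use of a trained language understanding service.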

Data Labeling
One of the most important requisites of supervised learning is the labeling of data. With
artificial intelligence having more of an impact on our daily routines, there is a constant need
to upgrade the machines in order to continue to provide results with ever enhanced accuracy.
To accomplish this the data input into the algorithm must be precisely labeled.
Look at the image in Figure 2.19 closely. What do you see? In it, unlabeled data gives a learning
machine no information about what is in the image.

Fig 2.19: Unlabeled data is difficult for Machine Learning

As such the machine cannot learn much about it and therefore the outcomes are inaccurate.
From a machine point of view the need for accurately labeled data is of the utmost priority in
order for it to understand what the image shows, what is written in a piece of text, and even
what a sound recording contains.

In the subsequent image (Figure 2.20), the data has been labeled. The machine can now easily
identify what is understood from the image and can find similar patterns from other images
when fed similar data. Data labeling is a process that involves putting electronic boundary
boxes on image files and tagging them with keywords that are both related and relevant to the
item within the boundary. It can also involve many other processes such as marking a human
face with points to analyze facial features for use in person identification search engines such
as those used by the police. Another important aspect is the categorization of texts, audio files
and videos, based on their content. In our example, the tag would be a ‘car’ as the traffic image
shows many varieties of vehicles including cars, mini trucks, open vans, two-wheelers, buses
etc.

Fig 2.20: Labeled data is easy for Machine Learning

As mentioned earlier, labeling of data may also involve the identification and marking of
certain points on the face such as the nose, eyes etc. Faces are marked like this from various
angles so that the machine can recognize the human face more reliably. Likewise, the same
label is applied repeatedly within an image (a car in our example) to teach the machine
that the label applies to the car irrespective of how it looks, its position, its color, and
the angle from which the image of the car was captured.
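To make the idea of a labeled image concrete, here is a minimal sketch of how the boxes and tags described above might be stored. The class names, field names, file name and pixel coordinates are all assumptions made for illustration (the boxes are often called bounding boxes); real labeling tools use their own formats.

```python
from dataclasses import dataclass, field


@dataclass
class BoundingBox:
    """One labeled region: pixel coordinates plus the keyword tag."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int
    label: str


@dataclass
class LabeledImage:
    """An image file together with all of its labeled regions."""
    filename: str
    boxes: list = field(default_factory=list)

    def count_by_label(self):
        """Count how many boxes carry each tag."""
        counts = {}
        for box in self.boxes:
            counts[box.label] = counts.get(box.label, 0) + 1
        return counts


# Hypothetical annotations for a traffic photo.
traffic = LabeledImage("traffic.jpg")
traffic.boxes.append(BoundingBox(40, 120, 210, 260, "car"))
traffic.boxes.append(BoundingBox(300, 100, 520, 280, "car"))
traffic.boxes.append(BoundingBox(560, 60, 900, 300, "bus"))

print(traffic.count_by_label())
```

Each additional box gives the learning algorithm one more example of what the tag looks like from a different position or angle.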
Similarly, the machine needs to learn the analysis of both text and the sentiments in order to
produce accurate textual outcomes. In a text scenario, the natural language is structured in
such a manner that the algorithm can understand and compute the relevant meaning of the
text. For spoken text, the machine needs to understand not only the word but also the tone
and context in which the word is spoken to correctly gauge the true meaning of the word and
attribute the same to an emotion.

Machine Learning and CHATBOTs

Notes for the educator
The educator can explain the need for robots to take over certain tasks from humans where it can be a great risk to human life. The same task however can be accomplished by a robot with more accuracy and with no loss of human life. Examples can be found in mining, underwater exploration, space exploration, defusing bombs, warfare etc.
Note that robots are not created to replace humans. The educator can prompt the class to answer the following questions, which can be asked to drive an understanding of the topic.
How many of the students are aware that many miners lose their life whilst getting stuck underground? Deep sea divers lose their lives due to their bodies' inability to handle the water pressure beyond a certain depth?

Robots and Humans
With advancements in the field of science and robotics, we have slowly entered an era where robots can be found doing many tasks both at work and in our personal lives. They can now be found doing daily household chores such as vacuuming, driving vehicles, disarming bombs, controlling artificial (prosthetic) limbs, supporting surgical procedures, manufacturing products, entertaining, teaching, and a lot more.
Why do you think robots are being created to perform certain kinds of work when traditionally humans have been doing them for years? The main reasons are speed, accuracy, and a reduced threat to life. A robot can work faster and more efficiently when compared to a human; this is why assembly lines use robot machines. The tasks to be performed are routine and do not have unexpected variations. Robots are also being used on farms to help farmers with the removal of weeds or unwanted plants from the field. Another use of robotics is to minimize human errors when performing tasks.

BOTS – Redefining Workplaces

Notes for the educator
The educator can discuss the theory of BOT creation and the various steps involved in creating them. Discuss with students the various steps in the creation of a BOT and their importance. What would happen if one of the steps is removed or not executed? What effect will it have on the final product?
The educator can show the below image to the students and ask what they understand from it. Ask if they have ever seen it used and where?

Fig 2.21: Conversational BOTs at Workplaces
Image credit: https://chatbotsmagazine.com/7-types-of-bots-8e1846535698

A BOT is a computer program that performs automatic repetitive tasks. It also acts as the primary tool for automating interactions with website content on a large scale. BOTs on the Internet are not a new concept and have been around for many years. BOT software is easy to implement and can serve a variety of purposes.

The Science behind the Generation of a BOT

Fig 2.22: The Lifecycle of a Bot

Building blocks - Program for a BOT
Building a BOT may appear to be quite intimidating. However, it can be made quite easy
thanks to a variety of tools and techniques. Here are the few general steps required to create a
BOT.
Stage 1: Requirements
Be aware of who the target user is, what their concerns are, and what benefits the solution is to
deliver. Gathering these market requirements is the first step towards the creation of a BOT.
Stage 2: Spec (Specification)
This is the product specification for the creation of a BOT. It puts down in writing the features
and necessary functionality required of the BOT. These features should be identified in Stage 1.
Note - The spec must also include a short and long description of the BOT along with other
things that will be required later at the publishing stage.
Stage 3: Script
This stage is used to build conversational scripts that represent user interactions with the BOT.
The scripts must represent actual user conversations.
Stage 4: Architect
Once the above-mentioned steps are done, the next step is to produce the engineering design
for the creation of the BOT which includes both the front-end and back-end components. The
front-end refers to the conversational interface, whilst the back-end refers to computations
performed by the BOT as well as interactions with other web services.
Stage 5: Development
In this stage, BOT developers use an iterative coding and testing development process. As soon
as the BOT is coded to handle a specific set of conversational statements, it’s good practice to
test the code via the messaging interface. It is also a good habit to insert tracking probes into
the BOT to ‘Track’ various stages of its implementation.
Stage 6: Test
The testing process is deeply intertwined with the development process. The code must be
tested not just in a BOT emulator, but also in the actual messaging platform in a real-time
scenario.
Stage 7: Deploy
Once the BOT is built it must be deployed to the hosted environment. The hosted environment
must be stable and needs to offer monitoring and development support.

Demonstration
Notes for the educator
The educator can conduct this demonstration with students using the video link. This is an interesting
demonstration to show how LUIS (Language Understanding Intelligent Service) works. Each of the steps
described ensures that all students, irrespective of their levels of understanding, should find the demo
interesting as well as easy to perform if they had to do it themselves. They need to appreciate that the key to
the creation and working of a successful AI operated machine learning is to use LUIS.

Fig 2.23: LUIS in action

Click on the link to watch the video demonstration: https://youtu.be/9tdkIQ-nkdo

Introduction to Supervised Machine Learning

Notes for the educator
The educator can teach students the various types of machine learning and the different scenarios in which each type of learning can be used.
It would be a good time to prompt students to come up with various applications of AI that they can think of as being used in daily life. The suggested answers could be music, social networking and e-commerce sites with predictive recommendations.
The educator can use the above-mentioned examples of real-life problems or could ask students to come up with new ones. Ask students to mention the kind of machine learning that could be used to devise solutions for each problem.

Major machine learning methods
There are many widely adopted machine learning methods such as supervised and unsupervised learning.

Supervised learning
Supervised Learning algorithms are trained using labeled examples, that is, inputs for which the desired output is known. The learning algorithm receives a collection of inputs and the corresponding correct outputs, learns by comparing its actual output with the correct outputs in order to identify errors, and modifies its model accordingly. Using strategies such as classification, regression, prediction and others, supervised learning uses patterns to predict the values of the label on unlabeled data. Supervised learning is usually employed in applications wherever historical knowledge can easily predict future events.
There is another form of machine learning known as unsupervised machine learning. In this form of machine learning there are no labeled examples and the outcome is unknown.

Fig 2.24: Pictorial representation of Supervised Learning

Semi-Supervised learning
This type of learning is used for the same types of applications as supervised learning, but uses
a mix of labeled and unlabeled data. Semi-supervised learning is useful when the cost
associated with labeling is too high to allow for a fully labeled training process. Early examples
of this are seen in systems that distinguish an individual's face on a web-based camera from
other faces.

Fig 2.25: Pictorial representation of Semi-supervised Learning

Reinforcement learning
This form of learning is used in robotics, gaming and navigation applications. In reinforcement
learning the algorithm discovers through trial and error which actions yield the greatest
rewards. This type of learning has three primary components: the agent (the learner or
decision-maker), the environment (everything the agent interacts with) and actions (what the
agent can do).
The objective is for the agent to choose actions that maximize the expected reward over a
given amount of time. The agent will reach the goal much faster by following a good strategy.
The goal in reinforcement learning is to learn the best strategy.
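The agent, environment, actions and rewards described above can be illustrated with a very small Q-learning sketch. Q-learning is one common reinforcement learning algorithm, chosen here as an assumption; the text does not name a specific method. The environment is a hypothetical row of five squares where the agent starts on the left and is rewarded only when it reaches the rightmost square. For simplicity the agent explores by choosing actions completely at random; all names and parameter values are illustrative.

```python
import random

N_STATES = 5          # states 0..4 in a row; state 4 is the goal
ACTIONS = ["left", "right"]
ALPHA = 0.5           # learning rate
GAMMA = 0.9           # discount factor


def step(state, action):
    """Environment: move along the row; reward 1 only on reaching the goal."""
    if action == "right":
        next_state = state + 1
    else:
        next_state = max(0, state - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=200, seed=0):
    """Learn action values by trial and error with random exploration."""
    rng = random.Random(seed)
    # q[state][action] is the agent's current estimate of long-term reward.
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)              # try something
            next_state, reward = step(state, action)  # see what happens
            best_next = max(q[next_state].values())
            target = reward + GAMMA * best_next
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# The learned strategy: in each state, pick the action with the highest value.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

After training, the learned strategy should prefer moving right in every square, because that is the shortest route to the reward.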

Fig 2.26: Pictorial representation of Reinforcement Learning

Supervised Machine Learning
Notes for the educator
Students can be introduced to a supervised machine in a manner that is understood by all. The educator can
use the following example (though we have tackled supervised learning in previous classes, we reiterate it to
bring all students up to the same level in terms of knowledge).
Imagine you have a younger sister. Your sister has never seen any animals in real life and it is your objective
to teach her that there are different types of animals in the world and she has to learn as many as she can.
The educator can ask the student how they would accomplish this objective.
Do you visit your sister at home, bring her a few pets, tell her the type they are, and explain anything she
wants to know about them? Or, do you take her to a jungle where there are many animals for you to explain
what kind of animal they are, and let her figure out for herself that there are different kinds? You don’t have
to choose one of the above. Any choice of environment you make will likely fall into a similar category to
one of the above. For example, taking her to the zoo falls in the same category as the jungle. Even if you
name every animal for her, she will be overwhelmed and won’t be able to memorize all of them as well as
she would in a household with pets.
Similarly, taking her to the jungle will enable her to learn the different kinds of animals. She will do this by
finding patterns, some of which you will understand and others you may not. Thus, you can expect her to know what a cat
is and possibly how it acts and what it eats and remember this for the rest of her life. She will also be able to
generalize her knowledge to the outdoors. If she sees a dog on the street, she will know it’s a dog. If she
sees some animal that she didn’t have as a pet, she won’t recognize that kind of animal, but she will know
that it’s nothing she was introduced to (i.e. neither a dog nor a cat).
Even though she knows some kinds of animals very well, you can only teach her so much. You just can’t get
her a lion or an elephant as a pet. You also can’t expose her to some insect that even you don’t know
enough about to teach her. This is supervised learning and the same analogy is used in machine learning.
Recap what was learned about supervised machine learning in the earlier classes.
Supervised machine learning is said to take place when the machine algorithm is provided with
labeled data. This can help the machine understand the data and to generalize a model based
on it. Supervised learning works on ‘training data’ without which the algorithm cannot give the
correct results. The use of training data is to guide the machine learning profile.
The training data consist of a set of training examples. In supervised learning, each example is
a pair consisting of an input object and the desired output value.
A supervised learning algorithm analyzes the training data and produces a function which
infers a result based on the input and past inferences. This is constantly mapped with similar
and new examples. An optimal scenario will allow the algorithm to correctly determine the
class labels for instances never encountered before.

Classroom Activity
• Suppose you had a basket filled with a number of differently shaped blocks (circle, square,
triangle and rectangle).
• Your task is to arrange them into groups.
• To understand the task first assign names to these shapes.
• We have four types of block called circle, square, triangle and rectangle. But they could be
called anything.
You learn from previous information about the physical characteristics of the blocks, so arrange
some of the blocks of the same type from the basket together. In data mining terminology this
earlier work is called training.
Now take a new block from the basket, note its size and shape and put it in the right group
based on what you have learned from an analysis of previous blocks.
This is Supervised Learning. The dataset you will have used to classify the blocks will be as
follows:

Ser.  Description of the block                                         Block Name
1.    Round without corners                                            Circle
2.    With 4 straight sides of equal length and 4 corners              Square
3.    With 3 straight sides of equal or unequal length and 3 corners   Triangle
4.    With 4 straight sides (opposite sides are equal) and 4 corners   Rectangle

Table 1: Dataset for Classroom Activity
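The classroom activity can also be run as a tiny program. The sketch below turns each row of the table into a labeled training example and classifies a new block by finding the most similar known example (a 1-nearest-neighbour classifier). The choice of algorithm and the encoding of each block as (straight sides, corners, all sides equal) are assumptions made for this illustration.

```python
# Labeled training examples: (straight sides, corners, all sides equal) -> name.
TRAINING_DATA = [
    ((0, 0, False), "Circle"),
    ((4, 4, True), "Square"),
    ((3, 3, True), "Triangle"),
    ((3, 3, False), "Triangle"),
    ((4, 4, False), "Rectangle"),
]


def distance(a, b):
    """Add up the differences between two feature tuples (True/False count as 1/0)."""
    return sum(abs(int(x) - int(y)) for x, y in zip(a, b))


def classify(block):
    """Return the label of the closest training example (1-nearest neighbour)."""
    features, label = min(TRAINING_DATA, key=lambda pair: distance(pair[0], block))
    return label


new_block = (4, 4, False)   # four sides, four corners, sides not all equal
print(classify(new_block))
```

A useful discussion point: a five-sided block such as (5, 5, True) is still given one of the four known names, because a supervised model can only answer with labels it has seen during training.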

Introduction to Unsupervised Machine Learning

Notes for the educator
The educator can ask students how they would deal with a scenario where the machine algorithm encounters a dataset in which the data is not labeled. Will the machine not learn, or will it ignore the data? Will a machine not learn from an unlabeled dataset?
Students are expected to come up with the answers and the educator can prompt students to engage in a small discussion on the subject, coming to a rational consensus if possible.

Earlier we had read about various types of machine learning. As well as the supervised way of learning there is also an unsupervised machine learning model, used when the machine is not familiar with the data that is being used as the input.

Unsupervised learning
This kind of learning is used with data that has no historical labels and therefore cannot use these to help it learn. However, the dataset may contain a few data points that are labeled. The algorithm needs to compute and analyze based purely on this limited amount of labeled data available and learn how to automatically label those that are not. The goal is to explore the data and recognize some patterns contained within.
Unsupervised learning works well on transactional data. For example, it can identify customers with similar attributes who can then be treated similarly in marketing campaigns. Or it can notice the most prominent attributes that a particular group of customers have. Popular techniques which use this form of learning include self-organizing maps, nearest-neighbor mapping, marketing analysis etc.

Fig 2.27: Pictorial representation of Unsupervised Learning

Earlier, we learnt that the machine-based algorithm will learn from datasets which are labeled
in order to provide guidance to the algorithm to compute the answer. As stated earlier, there
can be times when the dataset may not have any labels or may have been partially labeled. In
such a scenario, the machine still learns.

Clustering
With no formal guide on the labeling of data points, unsupervised machine learning is
dependent on something called ‘clustering’. Clustering algorithms run on the dataset and can
identify clusters only if such groupings actually exist in the data.
Clustering can be defined as the process of dividing the data points into groups in a way that
all the data points in a group share similar properties. The main goal of clustering is to create
groups with data points which have similar traits among them and assign them a label.
A few types of clustering are as follows:
• K–Means Clustering – Involves clustering the data points into K clusters. Finding
the right value of K is a complex process.
• Hierarchical Clustering – Involves clustering of the data points into parent and child clusters.
• Probabilistic Clustering – Involves clustering of the data points into clusters based on the
probability that they belong together.
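As an illustration of the first technique in the list, here is a minimal K-Means sketch in plain Python. It is a teaching sketch under stated assumptions rather than a production implementation: the six sample points are invented, K is fixed at 2, and the starting centroids are passed in by hand instead of being chosen at random.

```python
def squared_distance(p, q):
    """Squared straight-line distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Average position of a group of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, centroids, max_iterations=100):
    """Plain K-Means: assign points to the nearest centroid, then move centroids."""
    centroids = list(centroids)
    labels = []
    for _ in range(max_iterations):
        # Assignment step: index of the closest centroid for every point.
        new_labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the clusters are stable
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = mean_point(members)
    return labels, centroids


# Invented sample data: two visibly separate groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = k_means(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

Note that no labels were supplied: the algorithm only reports which points belong together, and a person still has to decide what each group means.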
Clustering can be used in a variety of applications. Some of them can be:
• Recommendation engines
• Market segmentation
• Social network analysis
• Search result grouping
• Medical imaging
• Image segmentation
• Anomaly detection

Assessment Questions
1. Define data and list examples of data that you think your school would gather from
others or prepare itself, and how it could be used.
Example Answer: Data is distinct information that is formatted in a special way. In general
Data can be used by the school to send out emails, monitor attendance, look for patterns of
bad behavior etc.

2. What do you associate as sentiment in a human being?


Example Answer: An attitude, thought or judgment prompted by feeling.

3. How is machine learning useful in meteorology?


Example Answer: Refer to the links below:
https://physicstoday.scitation.org/doi/abs/10.1063/PT.3.4201?journalCode=pto
https://emerj.com/ai-sector-overviews/ai-for-weather-forecasting/

4. How can customer services be improved by using sentiment analysis?


Example Answer: Refer to the link below:
https://mashable.com/2010/04/19/sentiment-analysis/

5. Write down the correct term for each of these definitions:


• Countless objects that we use in our day-to-day lives linked together.
(Answer: Internet of Things or IoT)
• Technology that enables machines to recognize spoken words and translate them into text.
(Answer: Speech Recognition)
• Technology that enables a device to comprehend human language in the way, or manner, in
which it is spoken by humans.
(Answer: Natural language processing)

6. Why is machine learning the need of today's world?


Example Answer: Machine learning is needed to complete tasks that are too complex for
humans to execute directly, such as handling and analyzing big data. Machine learning is
also useful for finding relationships between things, especially in very large datasets
that humans cannot process efficiently.

7. Give at least 4 real-life examples of how machines are replacing humans.
Example Answers:
• Switchboard operator
• Bowling alley pinsetter
• Lift operator
• Film projectionist
• Bridge toll or car park fee collector
• Check-out cashier
• Railway station ticket seller
• Assembly line worker

8. Use any example from your daily life to explain what you have understood about
unsupervised and supervised learning.
(Subjective Answer)

9. Explain the use of reinforcement learning in gaming.


Example Answer: Reinforcement learning is often used for robotics, gaming and navigation. With reinforcement
learning, the algorithm discovers through trial and error which actions yield the greatest
rewards. This type of learning has three primary components: the agent (the learner or
decision maker), the environment (everything the agent interacts with) and actions (what
the agent can do). The goal in reinforcement learning is to learn the best policy.

10. What is the process of marking human face related data points?
Example Answer: Labeling of data involves the identification and marking of certain
points on the face such as the nose, eyes etc. The data is marked from various angles of
the image so that the machine is able to recognize the human face with the help of an
appropriate algorithm.

11. What is Machine Learning?


Example Answer: Machine learning is a specific subset of AI which trains a machine on
how to learn. It is an application of artificial intelligence that provides computer systems
with the ability to learn and improve automatically from experience without being
explicitly programmed.

12. What are the probable applications of Supervised Machine Learning?
Example Answer: Supervised machine learning finds application in areas such as image
classification, speech recognition, and number prediction.

13. Why is labeled data easier for a machine to learn from?


Example Answer: The machine finds it easier to learn from labeled data because the
labeled data acts as training data, thereby helping the machine better understand the
expected output.

14. Fill in the Blanks


• In data mining terminology, the earlier work is called ________.
Answer: training the data.
• Supervised learning uses _____.
Answer: labeled data.
• A generic BOT framework consists of a ________.
Answer: communication channel between the registered user and the CHATBOT.
• A CHATBOT translates the text written by the user using a ______.
Answer: communication channel.

15. True or False


• Unlabeled data is less expensive and takes less effort to acquire.
Answer: True.
• Semi supervised learning uses only unlabeled data.
Answer: False
• In reinforcement learning, the algorithm discovers through trial and error which actions yield the
greatest rewards.
Answer: True

Questions to consider
• Name the major differences between alarms and reminders.
https://medium.com/truemd/whats-the-difference-between-an-alarm-and-a-reminder-a73c11dc1a73

• What probable challenges will Cortana face when playing a song that has a remake version too?
https://www.howtogeek.com/402579/i-used-a-cortana-smart-speaker-all-weekend.-heres-why-it-failed/

• Probe the various aspects of the debate of ‘human vs. BOT interaction’.
https://www.intercom.com/blog/bots-versus-humans/
https://www.retaildive.com/news/70-of-consumers-still-want-human-interaction-versus-bots/543324/
https://cyfuture.com/blog/the-great-bot-battle-ai-chatbots-vs-human-powered-live-chat

• Investigate the various data labeling approaches and find out the pros and cons with suitable
examples.
https://www.kdnuggets.com/2018/05/data-labeling-machine-learning.html

• Can a BOT replace humans?
https://www.bbntimes.com/en/technology/the-rise-of-chatbots-will-they-replace-humans

• Can BOTs turn malicious?
https://www.webroot.com/us/en/resources/tips-articles/what-are-bots-botnets-and-zombies
https://www.symantec.com/blogs/feature-stories/malicious-bot-attacks-why-theyre-more-dangerous-ever

• Elaborate on the steps for data protection. Name the steps to annotate the data.
https://www.richardsandrichards.com/6-steps-to-complete-data-protection-for-your-small-business/
https://resources.infosecinstitute.com/how-to-implement-a-data-privacy-strategy-10-steps/#gref
https://medium.com/thelaunchpad/spinning-up-an-annotation-team-c74c6765531b

• Deduce how much data is required for analysis by the machine.
https://machinelearningmastery.com/much-training-data-required-machine-learning/
https://towardsdatascience.com/how-do-you-know-you-have-enough-training-data-ad9b1fd679ee

• Find communities that provide free data for research.
https://www.nature.com/sdata/policies/repositories

• Design steps for ‘Tic Tac Toe’ so that the computer's chances of winning are maximized.
https://www.wikihow.com/Win-at-Tic-Tac-Toe

• Calculate the best moves to solve the ‘Tower of Hanoi’ problem.
https://en.wikipedia.org/wiki/Tower_of_Hanoi

• Write down the different appliances that generate datasets, in a smart home
o Security camera related data
o Thermostat related data
o Electricity consumption data
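
For the ‘Tower of Hanoi’ question above, educators may find a short worked example useful. The sketch below is only one possible illustration: the function name hanoi and the peg names "A", "B" and "C" are assumptions, not part of this handbook. It uses the standard recursive strategy: move the top n-1 disks to the spare peg, move the largest disk to the target peg, then move the n-1 disks on top of it. This takes 2^n - 1 moves, which is the minimum.

```python
def hanoi(n, source, target, spare, moves=None):
    """Return the list of (from_peg, to_peg) moves that solves n disks."""
    if moves is None:
        moves = []
    if n == 0:
        return moves
    # Move the n-1 smaller disks out of the way, onto the spare peg.
    hanoi(n - 1, source, spare, target, moves)
    # Move the largest remaining disk straight to the target peg.
    moves.append((source, target))
    # Move the n-1 smaller disks from the spare peg onto the largest disk.
    hanoi(n - 1, spare, target, source, moves)
    return moves

moves = hanoi(3, "A", "C", "B")
print(len(moves))  # 7 moves for 3 disks (2**3 - 1)
for step, (src, dst) in enumerate(moves, start=1):
    print(f"Step {step}: move the top disk from {src} to {dst}")
```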

Some Practical Assignments/Lab Work
Assignment 1
Use the Bing search engine to prepare a report on the following.
A. Types of machine learning.
B. Heuristic search in AI.
C. Knowledge representation technique.
D. AI based games for competitive entertainment such as Chess.
Assignment 2
Outline the challenges that can be encountered in problem identification.
https://www.toolshero.com/problem-solving/problem-definition-process/
Assignment 3
Evaluate the importance and role of an identifier in the dataset.
https://www.ngdc.noaa.gov/wiki/index.php/Data_Set_Identifiers_and_other_Unique_IDs
https://www.dataone.org/best-practices/provide-identifier-dataset-used
Assignment 4
Create datasets for the following:
• Image processing – Create a dataset of at least 100 images of natural scenes.
• Sentiment Analysis – Choose a product of your choice and search for it on more than one
ecommerce website. Write down all the reviews (not less than 100) written for the product in a
document under the website name.
• Video processing – Shoot or download birthday party videos (not less than 50) and collect
them in a folder.
• IoT – Download any three IoT datasets online. Suggested – weather data, traffic data,
agriculture data and smartphone data.

Assignment 5
Notes for the educator
The educator can discuss with students the possible questions that would prompt the user to
provide the necessary data to the machine.
Suggested questions for different scenarios:
Club membership enquiry
• What is your age (check for eligibility)?
• What games do you play?
• What are your play timings?
• Duration for membership – quarterly, bi-monthly, half yearly or yearly?
Career guidance or higher education
• Percentage scored in class 10th exams?
• Present options for subjects?
• What is the user preference? (Present streams – science, arts, commerce, commerce with
mathematics etc.)
Design a possible BOT interaction, with questions and answers, for the club membership
enquiry scenario.
Design a possible BOT interaction, with questions and answers, for the career guidance or
higher education in AI scenario.
Assignment 6
Notes for the educator
The educator can help the students in preparing datasets. A sample for each kind of written communication
can be prepared for students to get inspiration from and create datasets using it as a reference. It is
suggested that the educator helps the students create labels for the machine to identify the type of
communication. Based on these labels, as well as the various keywords that will have to be mapped against
the labels, the machine will be able to differentiate and segregate the communication received.
Suggested keywords
• Staff communication mails – staff, teacher, educator, subject, class teacher, discipline,
permission, class, etc.
• Parent complaint mails – ward, mother, father, guardian, class, student, complaint, unaware
etc.
• Educational bodies mails – authority, board, school, inspection, requirements etc.
• Vendor mails – vendor, issue, payment, permission, principal, office, dated etc.
• Co-Curricular notification mails – notification, circular, school, district, state, level,
competitions, class, participation, participate etc.

It is to be noted that there could be certain keywords that would be common to more than one
communication type. The machine is expected to focus on both similar and dissimilar keywords
and labels to identify and segregate.
Consider the scenario of a school where the principal needs help from a machine-based
application with mail segmentation. Students are to consider segregating the mails into the
following categories:
• Staff communication mails
• Parent complaint mails
• Educational bodies mails
• Vendor mails
• Co-Curricular notification mails
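
The keyword-to-label mapping described in this assignment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name classify_mail, the "Unclassified" fallback and the two sample mails are hypothetical, and counting keyword matches is a deliberately simple stand-in for a trained model. The keyword lists are the suggested keywords from this assignment.

```python
import re

# Labels and keywords taken from the "Suggested keywords" list above.
KEYWORDS = {
    "Staff communication mails": ["staff", "teacher", "educator", "subject",
                                  "class teacher", "discipline", "permission", "class"],
    "Parent complaint mails": ["ward", "mother", "father", "guardian", "class",
                               "student", "complaint", "unaware"],
    "Educational bodies mails": ["authority", "board", "school", "inspection",
                                 "requirements"],
    "Vendor mails": ["vendor", "issue", "payment", "permission", "principal",
                     "office", "dated"],
    "Co-Curricular notification mails": ["notification", "circular", "school",
                                         "district", "state", "level", "competitions",
                                         "class", "participation", "participate"],
}

def classify_mail(text):
    """Return the label with the most distinct keyword matches in the mail."""
    lowered = text.lower()
    scores = {}
    for label, keywords in KEYWORDS.items():
        # Count each keyword once if it appears as a whole word or phrase.
        scores[label] = sum(
            1 for kw in keywords
            if re.search(r"\b" + re.escape(kw) + r"\b", lowered)
        )
    best = max(scores, key=scores.get)
    # Assumption: a mail with no keyword matches is left unclassified.
    return best if scores[best] > 0 else "Unclassified"

# Hypothetical sample mails for illustration only.
mail_1 = "My ward was marked absent and I, as his father, wish to raise a complaint."
mail_2 = ("Payment for the invoice dated 5 March is pending; the vendor requests "
          "the principal's office to resolve the issue.")
print(classify_mail(mail_1))  # Parent complaint mails
print(classify_mail(mail_2))  # Vendor mails
```

Because some keywords (such as "class" or "school") belong to more than one label, a mail is assigned to the label with the highest overall score rather than on the basis of any single keyword.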

Notes for the educator


The educator can help the class formulate different datasets that are to be used to complete the task.
Identify the problem, the datasets and devise a solution keeping in mind the ethical issues and the privacy
concerns regarding data collection and processing.

Assignment 7
Prepare a detailed report on how machines develop intelligence and learn from reinforcement
methodology in a game of chess.
https://www.infoworld.com/article/3400876/reinforcement-learning-explained.html
Assignment 8
Design a student assistance program for students with low performance. How can AI assist in
identifying the weak students?
(Subjective answer)

Assignment 9
Refer to the website below to understand the IRIS dataset and answer the following questions.
https://archive.ics.uci.edu/ml/datasets/iris
A. What are the features/attributes of the dataset?
B. What are the targets/classes of the dataset?
C. How many rows are there in the dataset?
D. Are there any missing values in the dataset?
E. Is the data univariate or multivariate?
F. If we follow a 60:20:20 pattern for train, validate and test, how many rows will there be in
each dataset?
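
For question F, the arithmetic can be checked with a short Python sketch. It assumes the 150-row size of the Iris dataset as listed on the UCI page, and the variable names are illustrative only.

```python
# Assumption: the Iris dataset has 150 rows (the size listed on the UCI page).
total_rows = 150
split_percent = {"train": 60, "validate": 20, "test": 20}

# Multiply the total by each percentage to get the rows in each part.
rows = {name: total_rows * pct // 100 for name, pct in split_percent.items()}

# The three parts should add back up to the full dataset.
assert sum(rows.values()) == total_rows
print(rows)  # {'train': 90, 'validate': 30, 'test': 30}
```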

Assignment 10
Perform classification of students to understand who would be interested in joining the sports
club of the school.
Lab Session -1: Create a dataset or use any existing free dataset
Lab Session -2: Study the dataset of the students
Lab Session -3: Dataset should include:
• Student ID/Admission number
• Interest in sports (name of the sport in which the student is interested in participating)
• Achievement in the sports
• Academic scores
• Distance from school to home
• Height and shoe size

Practical Assignments
Assignment 1
Imagine a situation at home where your family is expecting guests. You have lights at various
locations both indoors and outdoors. There are lights at the doorway, near the gate, along the
pathway, in the garden and also in the interior rooms of the house. Each family member has a
different opinion about which light to switch on. Write down the problem statement and
alternative solutions.
Assignment 2
Imagine a hypothetical situation where you are exploring various applicable career
choices with the help of a CHATBOT assisting you in the process. Prepare a set of
question/answer trails.
Assignment 3
Create a bank of images (more than 50). It should contain images of people with emotions
(various ages, color, expressions etc.). Using Microsoft’s online ‘Face and Emotion Recognition’
application, run the images to predict the emotion and analyze visual content.
https://aidemos.microsoft.com/face-recognition

Further Reading
• https://www.forbes.com/sites/willemsundbladeurope/2018/10/18/data-is-the-foundation-for-artificial-intelligence-and-machine-learning/#3eccba5251b4
• https://towardsdatascience.com/role-of-data-science-in-artificial-intelligence-950efedd2579
• http://www.dbta.com/BigDataQuarterly/Articles/The-Importance-of-Data-for-Applications-and-AI-129316.aspx
• https://www.technative.io/data-quality-vs-data-quantity-whats-more-important-for-ai/
• https://pjreddie.com/darknet/yolo/
• https://www.houseofbots.com/news-detail/3581-4-understand-the-machine-learning-from-scratch-for-beginners
• https://www.minigranth.com/artificial-intelligence/problem-solving-in-artificial-intelligence/
• Problem Solving in Artificial Intelligence by Prof Philippe Codognet. Link -
http://webia.lip6.fr/~codognet/PSAI/1-introduction.pdf
• Introduction to Artificial Intelligence: Problem Solving and Search by Bernhard Beckert, 2004. Link -
https://formal.iti.kit.edu/~beckert/teaching/Einfuehrung-KI-WS0304/04ProblemSolving.pdf
• Learning problem solving (artificial intelligence, machine) by Bruce Walter Porter, University of
California, Irvine, 1984.
• Learning problem solving strategies using refinement and macro generation by HA Güvenir, GW Ernst,
Elsevier Science Publishers B.V. (North-Holland) 1990. Link -
http://repository.bilkent.edu.tr/bitstream/handle/11693/26215/bilkent-research-paper.pdf?sequence=1&isAllowed=y
• Microsoft Power BI Dashboards Step by Step 1st Edition by Errin O'Connor
• The 5 Clustering Algorithms Data Scientists Need to Know -
https://towardsdatascience.com/the-5-clustering-algorithms-data-scientists-need-to-know-a36d136ef68
• Clustering Introduction & different methods of clustering -
https://www.analyticsvidhya.com/blog/2016/11/an-introduction-to-clustering-and-different-methods-of-clustering/
• What is Data Labeling? - https://www.youtube.com/watch?v=_BasmAAub7w
• Why Smart Labeling is the Future of Data Annotation - https://www.youtube.com/watch?v=V33Ut36eUsY
• Four Mistakes You Make When Labeling Data -
https://towardsdatascience.com/four-mistakes-you-make-when-labeling-data-7e431c4438a2
• Practical Machine learning problems -
https://machinelearningmastery.com/practical-machine-learning-problems/
• https://www.messengerpeople.com/chatbots-what-is-a-whatsapp-bot-actually/
• https://www.geeksforgeeks.org/what-is-reinforcement-learning/
• https://deepsense.ai/what-is-reinforcement-learning-the-complete-guide/

Reference Links
• Algorithmia (2018). Introduction to Unsupervised Learning | Algorithmia Blog. [online]
Algorithmia Blog. Available at:
https://blog.algorithmia.com/introduction-to-unsupervised-learning/ [Accessed 10 Sep. 2019].
• Al-Masri, A. (2019). What Are Supervised and Unsupervised Learning in Machine Learning?
[online] Medium. Available at:
https://towardsdatascience.com/what-are-supervised-and-unsupervised-learning-in-machine-learning-dc76bd67795d
[Accessed 6 Sep. 2019].
• Author (2019). Data labeling service: training data for machine learning | Clickworker. [online]
Clickworker.com. Available at:
https://www.clickworker.com/crowdsourcing-glossary/data-labeling/ [Accessed 6 Sep. 2019].
• Automationanywhere.com. (2019). TAKE CHARGE OF THE BOT LIFECYCLE. [online] Available at:
https://www.automationanywhere.com/in/solutions/enterprise-bot-lifecycle-management
[Accessed 12 Jul. 2019].
• Brownlee, J. (2015). Basic Concepts in Machine Learning. [online] Machine Learning Mastery.
Available at: https://machinelearningmastery.com/basic-concepts-in-machine-learning/
[Accessed 29 Jun. 2019].
• Chen, J. (2019). Neural Network Definition. [online] Investopedia. Available at:
https://www.investopedia.com/terms/n/neuralnetwork.asp [Accessed 27 Sep. 2019].
• Chris (2009). How To Write A Problem Statement | Ceptara. [online] Available at:
http://www.ceptara.com/blog/how-to-write-problem-statement [Accessed 4 Jul. 2019].
• Decypher. (2018). Machine Learning: What it is and Why it Matters - Decypher. [online] Available
at: https://www.decypher.com/machine-learning-matters/ [Accessed 4 Jul. 2019].
• Dietrich, D., Heller, B. and Yang, B. (2015). Data Science & Big Data Analytics: Discovering,
Analyzing, Visualizing and Presenting Data. [ebook] Indianapolis: John Wiley & Sons, Inc., pp.29-30.
Available at:
http://index-of.co.uk/Big-Data-Technologies/Data%20Science%20and%20Big%20Data%20Analytics.pdf
[Accessed 13 Sep. 2019].
• Guru99team (2019). Supervised Machine Learning: What is, Algorithms, Example. [online]
Guru99.com. Available at: https://www.guru99.com/supervised-machine-learning.html [Accessed
6 Sep. 2019].
• Kaushik, S. (2016). Clustering Introduction & different methods of clustering. [online] Analytics
Vidhya. Available at:
https://www.analyticsvidhya.com/blog/2016/11/an-introduction-to-clustering-and-different-methods-of-clustering/
[Accessed 10 Sep. 2019].
• Loon, R. (2018). Machine learning explained: Understanding supervised, unsupervised, and
reinforcement learning. [online] Big Data Made Simple. Available at:
https://bigdata-madesimple.com/machine-learning-explained-understanding-supervised-unsupervised-and-reinforcement-learning/
[Accessed 6 Sep. 2019].

• Maj, M. (2019). Object Detection and Image Classification with YOLO. [online] Kdnuggets.com.
Available at:
https://www.kdnuggets.com/2018/09/object-detection-image-classification-yolo.html
[Accessed 29 Jun. 2019].
• McFadin, P. (2019). Internet of Things: Where Does the Data Go? [online] WIRED. Available at:
https://www.wired.com/insights/2015/03/internet-things-data-go/ [Accessed 15 Nov. 2019].
• Mitroff, S. (2019). What is a BOT? - CNET. [online] Available at:
https://www.cnet.com/how-to/what-is-a-bot/ [Accessed 5 Jul. 2019].
• Sheth, B. (2016). The BOT Lifecycle. [online] CHATBOTs Magazine. Available at:
https://chatbotsmagazine.com/the-bot-lifecycle-1ff357430db7 [Accessed 12 Jul. 2019].
• Shoemaker, C. (2019). IoT Data: How to Collect, Process, and Analyze Them. [online] Tech.
Available at:
https://it.toolbox.com/blogs/carmashoemaker/iot-data-how-to-collect-process-and-analyze-them-032619
[Accessed 15 Nov. 2019].
• Simmons, D. (2019). Pushing IoT Data Gathering, Analysis, and Response to the Edge - DZone IoT.
[online] dzone.com. Available at:
https://dzone.com/articles/pushing-iot-data-gathering-analysis-and-response-to-the-edge
[Accessed 15 Nov. 2019].
• Smith, A. (2018). Understanding Architecture Models of CHATBOT and Response Generation
Mechanisms - DZone AI. [online] dzone.com. Available at:
https://dzone.com/articles/understanding-architecture-models-of-chatbot-and-r [Accessed 12
Jul. 2019].
• University of Bath (2019). Data access statements - Archiving and sharing data - Library at
University of Bath. [online] Available at:
https://library.bath.ac.uk/research-data/archiving-and-sharing/data-access-statements
[Accessed 4 Jul. 2019].
• University of Nebraska-Lincoln (2019). Remember the 5 W's | IT Best Practices | Nebraska.
[online] Available at: https://its.unl.edu/bestpractices/remember-5-ws [Accessed 4 Jul. 2019].

Glossary
Ancient - Belonging to the very distant past and no longer in existence.
Logic - A system or set of principles underlying the arrangements of elements in a computer or
electronic device so as to perform a specified task.
Algorithms - A process or set of rules to be followed in calculations or other problem-solving
operations, especially by a computer.
Perceptions - The way in which something is regarded, understood, or interpreted.
Intervention - The action or process of intervening.
Complex - A group or system of different things that are linked in a close or complicated way;
a network.
Segmentation - Division into separate parts or sections.
Sentiment - Feelings of tenderness, happiness, sadness, or nostalgia.
Emotion - A strong feeling deriving from one's circumstances, mood, or relationships with
others.
Polarity - The state of having two opposite or contradictory tendencies, opinions, or aspects.
Parameter - A numerical or other measurable factor forming one of a set that defines a system
or sets the conditions of its operation.
Non-linear - Not arranged in a straight line.
Crux - The decisive or most important point at issue.
Application - A program or piece of software designed to fulfil a particular purpose.
Data mining - The practice of examining large pre-existing databases in order to generate
new information.
Stakeholder - A person with an interest or concern in something.
Narrative - A spoken or written account of connected events.
Untagged - (Of a piece of text or data) not identified or categorized by a tag.
