Student Guide - Module 2 Machine Learning

This document provides an overview of machine learning and its role as the foundation of artificial intelligence. It discusses how machine learning allows computers to automatically learn and improve from experience without being explicitly programmed. The document also describes how machine learning involves the development of algorithms that can access and learn from data. It provides an example of how search engines use machine learning to analyze past user clicks and determine the most relevant search results.

Uploaded by

dawow


Machine Learning

Module 2
Table of Contents

Learning Objectives
Machine Learning – The foundation of Artificial Intelligence
    Machine Learning
    The Need for Machine Learning
Understanding Data and Datasets
    Data and Its Utility
    Use of Data in Machine Learning
    Different Types of Datasets
    Sentiment Analysis
Design Thinking, Problem Identification, & Working with Data
    Design Thinking
    Problem Identification
Development and Understanding of the BOT Framework
    What is a BOT?
    What can a BOT do?
    The BOT Framework
    Data Labeling
Machine Learning and CHATBOTs
    Robots and Humans
    BOTS – Redefining Workplaces
    The Science behind the Generation of a BOT
    Building blocks - Program for a BOT
    Demonstration
Introduction to Supervised Machine Learning
    Major machine learning methods
    Supervised learning
    Semi-Supervised learning
    Reinforcement learning
    Supervised Machine Learning
    Classroom Activity
Introduction to Unsupervised Machine Learning
    Unsupervised learning
    Clustering
Assessments Questions
Questions to consider
Some Practical Assignments/Lab Work
Practical Assignments
Further Reading
Reference Links
Glossary

Disclaimer:
The Imagine Cup Junior guides and lesson materials are created by Microsoft and our partners
and are intended for guidance only, to support the Imagine Cup Junior Challenge. For
the latest on Microsoft AI please visit https://www.microsoft.com/en-us/ai

Learning Objectives
Through this module, students will get an overview of machine learning and understand how it
provides the foundation of AI. Students should be able to understand the basics of machine
learning and use the concepts as applied to their daily life.
At the conclusion of the module, students should be able to:
• Understand the basics of machine learning.
• Comprehend the basics of using datasets and working with data.
• Understand the machine learning approach to problem-solving, and devise solutions to
problems.
• Comprehend the latest design-related perspectives, ideas, concepts, and solutions.
• Understand the importance of data and ways to protect it.
• Execute projects using design thinking principles.
• Understand and analyze data-related problems.
• Comprehend the basics of creating a BOT and the related working framework.
• Understand the various challenges of creating a BOT.
• Appreciate the similarities and differences between a machine-driven BOT and a human.
• Understand the concept of cluster algorithms and apply the principle of ‘clustering’ on data.

Machine Learning – The foundation of Artificial
Intelligence
Although the term is often used interchangeably with artificial intelligence, machine learning
has a different meaning. It is the ‘learning’ that the machine derives from its experience in
processing data. The primary objective of machine learning is to ensure that the machine learns
from the data. In other words, machine learning is an application of artificial intelligence (AI)
that provides computer systems with the ability to automatically learn and improve from
experience without being explicitly programmed. Machine learning as a science focuses on the
development of computer programs that can access data and use it to learn for themselves.
This is sometimes known as heuristic programming.
Machine learning can also be defined as the study of computer algorithms that improve
automatically through experience. Some machine learning systems are built to read and
understand human language, to interpret their surroundings, and to make predictions that are
as accurate as possible. They can also assess their predictions in real time and adapt to their
environment. For example, when a user searches for a topic, the search engine suggests the most
frequently searched related ‘search topics’. The search engine looks at past clicks from people
around the world in order to understand which pages are more relevant for those searches than
others. It then serves those results as a list with the most relevant at the top. A human could
not perform such an exercise in the space of a few seconds.
The machine learns how to handle search requests and generates a set of instructions to create
the expected outcome. Hence, machine learning can also be understood as a set of
procedures that deals with huge amounts of data smartly (using algorithms or a set of
logical rules) to derive results.
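
The search engine example can be sketched in a few lines of Python. This is only a minimal illustration and not how any real search engine works: the click log, the page names, and the function `rank_results` are all made up for the example. The sketch counts past clicks per page for a query and lists the most-clicked page first.

```python
from collections import Counter

# Hypothetical click log: (search query, page the user clicked on).
click_log = [
    ("machine learning", "intro-to-ml"),
    ("machine learning", "ml-course"),
    ("machine learning", "intro-to-ml"),
    ("machine learning", "ml-news"),
    ("machine learning", "intro-to-ml"),
    ("machine learning", "ml-course"),
    ("chatbots", "what-is-a-bot"),
]


def rank_results(query, log):
    """Return the pages clicked for `query`, most-clicked first."""
    clicks = Counter(page for q, page in log if q == query)
    return [page for page, _ in clicks.most_common()]


print(rank_results("machine learning", click_log))
```

As more clicks are added to the log, the ranking changes without anyone rewriting the function, which is the sense in which the system "learns" from data.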

Machine Learning
Whilst you are all, by now, familiar with artificial intelligence (AI), machine learning is a specific
subset of AI that is concerned with training a machine to learn. It is an application of artificial
intelligence that provides computer systems with the ability to automatically learn and
improve based on their experience without being explicitly programmed.

The process of machine learning begins with analyzing observations (data) such as examples,
direct experiences, or instructions, and looking for patterns in the data. Based on this analysis
and the cumulative data it has been given, the machine learns to make better decisions in the
future. The primary goal is to allow the machine to learn automatically, without human
intervention or assistance, and to adjust its actions accordingly.

Fig 2.1: Phase 1 of Machine Learning

Fig 2.2: Phase 2 of Machine Learning

The Need for Machine Learning
The reason behind machine learning is to automate mundane tasks to the extent that the
machine can learn, think, and make smart decisions on its own. It also aims to minimize human
interference, and thus human bias, in various scenarios. Machine learning is also needed for
tasks that are too complex for humans to program computers for directly. Some tasks are so
complex that it is impractical, if not impossible, for humans to cater for all the nuances and
code for every single instance separately. Instead, a large amount of data is provided to a
machine learning algorithm, and the algorithm explores the data and constructs a model that
will achieve the desired outcome.
Machine learning is also useful for finding relationships between things, especially in
exceptionally large datasets which are too big for humans to process efficiently. Its uses here
include object recognition, marketing analytics, analyzing scientific data in labs, and numerous
other applications that involve analyzing large amounts of data.

Fig 2.3:

The key difference between traditional programming and machine learning is that in traditional
programming, the data and the program are run on the computer to produce an output.
In machine learning, however, the data and the output are fed into a computer to create a
program. This program can then be used in the same way as one created by traditional
programming.
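
The contrast can be made concrete with a small Python sketch. The temperature-conversion task, the sample values, and the helper `fit_line` are assumptions chosen for illustration; they are not taken from the guide. In the first half a person writes the rule; in the second half only inputs and outputs are supplied, and a simple least-squares fit recovers a rule that should come out close to the hand-written one.

```python
# Traditional programming: a person writes the rule (the program) by hand.
def celsius_to_fahrenheit(celsius):
    return 1.8 * celsius + 32


# Machine learning: we only supply data (inputs) and the desired outputs.
inputs = [-10.0, 0.0, 10.0, 20.0, 30.0, 40.0]
outputs = [celsius_to_fahrenheit(c) for c in inputs]  # stand-in for measured data


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(inputs, outputs)


# The learned "program" can now be used like the hand-written one.
def learned_program(celsius):
    return slope * celsius + intercept


print(round(slope, 2), round(intercept, 2))
print(round(learned_program(25.0), 1))
```

Real machine learning models are far more flexible than a straight line, but the flow is the same: data and outputs go in, and a usable program comes out.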
A few examples of Machine Learning in our day-to-day life are:
• Cortana
• Refined search engine results (as represented in Fig 2.4)

Fig 2.4: Search Engine Result Refining, a typical example of machine learning

Understanding Data and Datasets
Data and Its Utility
This is Datum

Fig 2.5:

Data
Data can be defined as a collection of facts, numbers, or other information that is used
either for reference or for analysis. The singular form of data is datum.
In Figure 2.5, the task is to compute the overall percentage of each student’s marks. The
percentage obtained in an exam by a student is calculated as the sum of all the marks
obtained in the different subjects divided by the total marks available; when every subject is
marked out of 100, this is the same as dividing by the number of subjects. Therefore, the
marks are important for calculating the percentage, and to arrive at the result it is important
to take into account the marks obtained by each student in each subject.
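
As a worked example of this calculation, the short Python sketch below uses invented student names and marks, and assumes every subject is marked out of 100. The function name `percentage` is likewise just an illustrative choice.

```python
# Hypothetical marks (each subject assumed to be out of 100).
marks = {
    "Asha": {"Maths": 80, "Science": 70, "English": 90},
    "Ben": {"Maths": 60, "Science": 75, "English": 45},
}


def percentage(subject_marks, max_mark_per_subject=100):
    """Total marks obtained as a percentage of the total marks available."""
    total_obtained = sum(subject_marks.values())
    total_available = max_mark_per_subject * len(subject_marks)
    return 100 * total_obtained / total_available


for student, subject_marks in marks.items():
    print(student, percentage(subject_marks))
```

Each individual mark is a datum; the percentage cannot be computed unless every one of them is available.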

Dataset
A dataset is defined as a collection, or group, of data where every column denotes a particular
variable and each row relates to a specific member of that dataset.

This is a Dataset

Fig 2.6: A Collection of Data is called a Dataset
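
The row and column structure of a dataset can be seen in a small Python sketch. The CSV text below is made up for illustration; it uses the standard `csv` module to read it so that each row is one member of the dataset and each column is one variable.

```python
import csv
import io

# A tiny made-up dataset in CSV form: the header row names the variables.
raw = """student,maths,science,english
Asha,80,70,90
Ben,60,75,45
Chen,95,85,88
"""

rows = list(csv.DictReader(io.StringIO(raw)))

# Each row is one member of the dataset.
print(rows[0]["student"], rows[0]["maths"])

# Each column is one variable, collected across all members.
maths_column = [int(row["maths"]) for row in rows]
print(maths_column)
```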

Datasets are needed to train the learning algorithm that the machine uses in a particular
Artificial Intelligence context. Machines learn by analyzing data automatically and building
concepts from it. The whole model of machine learning is built on the premise
that systems can be programmed to learn from the data they receive as input. This is done
through the identification of patterns to make informed decisions with minimal human
intervention.
The entire process begins with the input of data into the machine. Data can be accumulated
either from existing datasets or in real time from physical sensors such as cameras,
temperature sensors, microphones, etc. The machine is equipped to understand and analyze
patterns and then perform certain tasks using those patterns as references. The machine works
iteratively: as the model is exposed to various kinds of data, it becomes capable of adapting
itself independently.
Machine learning, like human learning, draws on earlier results from similar scenarios, and thus
the application improves the more it is used. This leads to more trustworthy results, but
remember, the output is only as trustworthy as the datasets the machine has been
provided with over time.

Use of Data in Machine Learning
Data can be in the form of text, images, numbers, and even sound or video. The datasets are
analyzed to build up experience, which in turn is used to create a form of machine learning
program.

Fig 2.7: Machine Learning

Image Source - https://quantdare.com/machine-learning-a-brief-breakdown/

Different Types of Datasets

Different kinds of datasets are used to achieve a particular machine learning objective. Here
are a few of them:
• Image Processing
• Sentiment Analysis
• Natural Language Processing
• Video Processing
• Speech Recognition
• Internet of Things (IoT)

Image Processing
Datasets for image processing can be used for object detection, segmentation, and captioning
of images (Maj, 2019).

Fig 2.8: Datasets for image processing

Image Source - https://www.kdnuggets.com/2018/09/object-detection-image-classification-yolo.html

Fig 2.9: Segmentation and Captioning through Machine Learning

Image Source - https://towardsdatascience.com/faster-r-cnn-object-detection-implemented-by-keras-for-custom-data-from-googles-open-images-125f62b9141a

A dataset of a variety of facial expressions is used to recognize an expression and caption the
image accordingly, for example as a happy or sad face.

Sentiment Analysis
Whilst one parameter may be object recognition, another is human sentiment. This
layer of ‘sentiment analysis’, when put into context, can categorize human emotions
by type and by intensity. The algorithms used for the analysis of human sentiment
are advanced and are designed to generate accurate and useful results. One example of its use
is analyzing the sentiment of a customer.

Fig 2.10: Emotions depicted in a smiley

By analyzing sentiment accurately, and in particular when people are unhappy, the application
can focus on actions that could alter the person’s emotion. This could be used for good, in
supporting people with certain mental illnesses, or for a not so good purpose, such as
convincing someone to purchase a product they may not wish to. Remember, it is not the
technology but how it is used that is important.

Fig 2.11: Steps of Sentiment Analysis of an Ecommerce Platform
How is this achieved? Two ‘Polarity’ nodes are created: one for a positive sentiment, the other
for a negative sentiment. This is done to assist in identifying the right sentiment of a person.
The words associated with a given polarity node are then re-submitted to the algorithm for
more accurate sentiment analysis.
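
The two polarity nodes can be approximated with a very small Python sketch. It is a deliberate simplification: the word lists are illustrative and not a real sentiment lexicon, the function name `analyze` is an assumption, and production systems use trained models rather than fixed word lists.

```python
import re

# Two polarity "nodes": illustrative word lists, not a real sentiment lexicon.
POLARITY = {
    "positive": {"good", "great", "love", "excellent", "happy", "fast"},
    "negative": {"bad", "poor", "hate", "terrible", "slow", "broken"},
}


def analyze(text):
    """Count words matching each polarity node and return the overall label."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = {
        label: sum(word in vocabulary for word in words)
        for label, vocabulary in POLARITY.items()
    }
    if counts["positive"] > counts["negative"]:
        label = "positive"
    elif counts["negative"] > counts["positive"]:
        label = "negative"
    else:
        label = "neutral"
    return label, counts


print(analyze("Great phone, fast delivery, love it"))
print(analyze("Terrible battery and the screen arrived broken"))
```

Words that turn out to be good indicators of one polarity can be added to the corresponding list, which loosely mirrors the re-submission step described above.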

Fig 2.12: Polarity Nodes for Sentiment Analysis

Natural Language Processing (NLP)
NLP uses language comprehension to train the machine to adapt to natural changes in
language, such as the addition of new words, the editing of words to suit the current context,
the labeling of new words according to usage, and the deletion of words that have become
obscure and are no longer in use.
Natural language processing can be defined as a technology which enables the machine to
comprehend human language in the way it is spoken and understood by humans. One of the
most striking aspects is that natural languages such as English, French, German, and
Mandarin Chinese keep evolving, and the learning element of the machine adapts to this.
There is no fixed or permanent structure to language, which makes it very flexible. Different
kinds of dialects and sub-dialects are spoken in different regions, and each one of these is
slightly different from the others. This includes the use of slang and other urban or sub-cultural
language.

Fig 2.13: Steps of Sentiment Analysis of a Yammer Discussion

Fig 2.14: Example of Sentiment Analysis of a Yammer discussion
In Fig 2.14, the AI software takes words from a microblogging website and associates them
with a particular sentiment. This enables it to identify the overall tone of the conversation.

Video Processing
In this example, the AI software captures frames (screenshots) at regular intervals from a live
video stream and analyzes them to count the number of people who are about to get onto the
bus. By recognizing other objects, it can also calculate other parameters, such as the frequency
of buses arriving at the bus stop, or automatically identify rush hours across the day. This data
can then be used to manage public transport more efficiently.
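
The sampling and counting idea can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the `frames` list stands in for a live video stream, and the people counts are invented values that, in a real system, would come from an object-detection model. The function names `sample` and `busiest_hour` are hypothetical.

```python
from collections import defaultdict

# Stand-in for a video stream: (seconds since midnight, people visible in the frame).
# In a real system the count would come from an object-detection model.
frames = [
    (8 * 3600 + 0, 12), (8 * 3600 + 30, 15), (8 * 3600 + 60, 14),
    (13 * 3600 + 0, 3), (13 * 3600 + 30, 4), (13 * 3600 + 60, 2),
    (17 * 3600 + 0, 18), (17 * 3600 + 30, 22), (17 * 3600 + 60, 20),
]


def sample(frames, interval_seconds):
    """Keep frames whose timestamp falls on the sampling interval."""
    return [frame for frame in frames if frame[0] % interval_seconds == 0]


def busiest_hour(sampled):
    """Average the people count per hour and return the hour with the highest average."""
    per_hour = defaultdict(list)
    for timestamp, people in sampled:
        per_hour[timestamp // 3600].append(people)
    averages = {hour: sum(c) / len(c) for hour, c in per_hour.items()}
    return max(averages, key=averages.get), averages


snapshots = sample(frames, interval_seconds=60)
hour, averages = busiest_hour(snapshots)
print(len(snapshots), hour)
```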

Fig 2.15: Video Processing

Speech Recognition
Speech recognition can be defined as a technology which enables the recognition of the
spoken word and its subsequent conversion into text. The machine learns ways of identifying
and analyzing various human voices, both live and recorded, and processes them accordingly, for
example by converting speech into text for dictation in word processors such as Microsoft Word.
It is also used to understand the user in voice-activated modules in automated cars, and it plays
an important role in assisting people with disabilities.
Internet of Things (IoT)
The Internet of Things (IoT) refers to the countless networked devices we use to
make our lives easier. These devices rely on the Internet to gather and share data from
responsible sources in order to provide ‘smart’ services. With these huge datasets and massive
numbers of data sources becoming a reality, machine learning has become an integral part of
our daily lives.
Machine learning can be applied in almost all scenarios where the outcome is known. It can,
however, also be applied where the datasets are unknown, and in situations where there are
repeated forms of the same sort of data which can be used to reinforce the machine learning.
For example, machine learning can help in understanding and analyzing the patterns of waves
and oceanic currents in order to predict future sea temperatures, monsoon patterns, and even
the potential for a cyclone or other natural disaster in a specific geographical location.
Capturing IoT and Sensor Data
The Internet of Things (IoT) is more of a concept than an actual thing. The concept is to allow
us to interpret data from networked sensors or devices in the most meaningful ways possible.
The aim is to measure, analyze, visualize, predict, and react to the data accumulated from these
sensors. Forms of IoT that most people are familiar with include smart thermostats, smart
switches, and other internet-connected devices and appliances in the home. These are generally
considered part of Consumer IoT. Then there is Industrial IoT, or IIoT. This includes things like
the use of IoT devices in smart buildings, industrial automation, and the monitoring of
industrial processes.
Processing IoT Data
Processing the data from connected IoT sensors takes time and involves several
sub-procedures, such as:
• Standardizing or transforming the data into a uniform format to ensure it is compatible with
your application.
• Creating and storing a backup of the newly transformed data.
• Removing any repetitive, outdated, or unwanted data to help improve accuracy.
• Integrating additional structured (or unstructured) data from other sources to help enrich
the dataset.
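
The four sub-procedures above can be sketched as a small Python pipeline. Everything in it is assumed for illustration: the sensor names, the field names, the cut-off time, and the location lookup are invented, and a real application would read from and write to proper storage rather than in-memory lists.

```python
import copy

# Hypothetical raw readings from two sensors that report in different formats.
raw_readings = [
    {"sensor": "t1", "time": 100, "temp_f": 68.0},
    {"sensor": "t2", "time": 100, "temp_c": 21.5},
    {"sensor": "t2", "time": 100, "temp_c": 21.5},  # duplicate
    {"sensor": "t1", "time": 10, "temp_f": 50.0},   # outdated
]
sensor_locations = {"t1": "classroom", "t2": "library"}  # extra data source


def standardize(reading):
    """Convert every reading to the same fields, with temperature in Celsius."""
    if "temp_c" in reading:
        celsius = reading["temp_c"]
    else:
        celsius = (reading["temp_f"] - 32) * 5 / 9
    return {"sensor": reading["sensor"], "time": reading["time"],
            "temp_c": round(celsius, 1)}


def clean(readings, oldest_allowed_time):
    """Drop outdated readings and exact repeats, keeping the original order."""
    seen, kept = set(), []
    for r in readings:
        key = (r["sensor"], r["time"], r["temp_c"])
        if r["time"] >= oldest_allowed_time and key not in seen:
            seen.add(key)
            kept.append(r)
    return kept


standardized = [standardize(r) for r in raw_readings]    # step 1: uniform format
backup = copy.deepcopy(standardized)                      # step 2: keep a backup copy
cleaned = clean(standardized, oldest_allowed_time=50)     # step 3: remove unwanted data
enriched = [{**r, "location": sensor_locations[r["sensor"]]} for r in cleaned]  # step 4
print(enriched)
```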
IoT Data Analytics
When we apply data analysis tools or procedures to different types of IoT data, the process is
called IoT analytics. This process is performed on huge datasets to improve the efficiency of
procedures, applications, business processes, and production. Several types of data analytics
can be used on IoT data:
Prescriptive analytics
Prescriptive analytics is used to analyze what steps to take in a specific situation. It’s often
described as being a combination of descriptive and predictive analysis. When used in
commercial applications, prescriptive analytics helps decipher large amounts of information to
obtain more precise conclusions.
Spatial analytics
This is used to analyze location-based data. Spatial analytics deciphers geographic
patterns and determines spatial relationships between physical objects. Parking
applications, smart cars, and crop management are all examples of applications that benefit
from spatial analytics.
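
A parking application gives a simple example of a spatial relationship: which free bay is closest to the driver? The Python sketch below uses the standard haversine formula for great-circle distance. The bay names, coordinates, and free-space counts are invented for illustration, and the function names are assumptions.

```python
import math

# Hypothetical parking bays: name -> (latitude, longitude, free spaces).
parking_bays = {
    "North Gate": (51.5080, -0.1281, 4),
    "Library": (51.5055, -0.1200, 0),
    "Station": (51.5033, -0.1195, 12),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_free_bay(lat, lon, bays):
    """Return the closest bay that still has at least one free space."""
    free = {name: (b_lat, b_lon) for name, (b_lat, b_lon, spaces) in bays.items()
            if spaces > 0}
    return min(free, key=lambda name: haversine_km(lat, lon, *free[name]))


print(nearest_free_bay(51.5056, -0.1205, parking_bays))
```

In this made-up data the closest bay has no free spaces, so the sketch picks the nearest bay that does.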
Streaming analytics
Streaming analytics, sometimes referred to as event stream processing, is the analysis of
massive datasets that are in motion, that is, data that arrives continuously. These real-time
data streams can be analyzed to detect emergency or urgent situations, facilitating an
immediate response. The types of IoT applications that benefit from streaming analytics
include those used in traffic analysis, air traffic control, and police CCTV.
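
One common way to analyze a stream is to keep only a short sliding window of the most recent values and raise an alert when the window crosses a threshold. The Python sketch below illustrates that idea; the vehicle counts, the window size, the threshold, and the function name `monitor` are all assumptions made for the example.

```python
from collections import deque


def monitor(stream, window_size, threshold):
    """Yield (index, window average) whenever the recent average exceeds the threshold."""
    window = deque(maxlen=window_size)
    for index, value in enumerate(stream):
        window.append(value)          # the oldest value drops out automatically
        if len(window) == window_size:
            average = sum(window) / window_size
            if average > threshold:
                yield index, average


# Hypothetical stream: vehicles counted per minute at a junction.
vehicles_per_minute = [12, 14, 13, 15, 40, 45, 50, 16, 14]

alerts = list(monitor(vehicles_per_minute, window_size=3, threshold=30))
print(alerts)
```

Because the function processes one value at a time, it never needs the whole dataset in memory, which is what makes the approach suitable for data that keeps arriving.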
Time series analytics
Time series analytics is based on time-based data, which is analyzed to show any anomalies,
patterns, or trends. Two systems that greatly benefit from time series analytics are health and
weather-monitoring systems.
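
A minimal sketch of anomaly detection on time-based data is shown below. It flags a reading when it lies far from the average of the readings just before it. The temperature values, the window length, and the tolerance are invented for illustration, and this simple rule is only one of many possible techniques.

```python
from statistics import mean, stdev

# Hypothetical hourly temperature readings (°C) with one faulty spike.
readings = [20.1, 20.4, 20.2, 20.6, 20.5, 20.8, 35.0, 20.9, 21.1, 21.0]


def find_anomalies(values, window=5, tolerance=3.0):
    """Flag values that differ from the mean of the previous `window` values
    by more than `tolerance` standard deviations of those values."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        centre, spread = mean(history), stdev(history)
        if spread > 0 and abs(values[i] - centre) > tolerance * spread:
            anomalies.append((i, values[i]))
    return anomalies


print(find_anomalies(readings))
```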
We are surrounded by IoT data in our homes, our cars, and our schools. The amount of data
that IoT technology produces is massive. By collecting, processing, and analyzing this data, we
can gain valuable insights to help us make better decisions about our future.
The following links give access to free datasets of IoT and sensor-based data for you to
download.
• https://data.world/datasets/iot
• https://hub.packtpub.com/25-datasets-deep-learning-iot/
• https://www.kaggle.com/uciml/biomechanical-features-of-orthopedic-patients
• https://www.datasciencecentral.com/profiles/blogs/great-sensor-datasets-to-prepare-your-next-career-move-in-iot-int

Design Thinking, Problem Identification, & Working
with Data
Design Thinking
Design thinking is a non-linear, iterative process designed to help understand a problem, the
users it affects, the assumptions made, and any available solutions, keeping in mind all the
parameters. Design teams are responsible for implementing solutions to problems within a
specific time frame. Design thinking is therefore a problem-solving methodology aimed at
devising solutions.
Over the years, ‘design thinking’ has gained prominence for its effectiveness in solving
problems and in finding as many alternative solutions as possible. Organizations such as
Microsoft use it successfully to design and create products. Other organizations, such as
universities, banks, and other companies, also use it to help solve real-life problems in their
sectors.
The various stages of design thinking are part of an iterative process where the ultimate
objective is to acquire an in-depth understanding of the problem and suggest a single solution
or alternatives within specific boundaries.
Stage 1: Empathize—Research Your User’s Needs
This stage allows the team to gain an understanding of the problem and of the extent to which
it affects the user. This step requires gaining an empathetic understanding of the user’s issues.
Empathy, or the ability to imagine oneself in the condition of another, is vital in any
human-centric design process, because the process should be based on the user’s perspective
rather than on the team’s assumptions.
Stage 2: Define—State Your User’s Needs and Problems
In this stage, the information accumulated in the previous stage is analyzed. The team makes
its observations, and a definition of the core problem is written based on them. This is
known as the Problem Statement, and its most important aspect is that it should be as
human-centric as possible. We will teach you how to
create a Problem Statement in depth later in the course.
Stage 3: Ideate—Challenge Assumptions and Create Ideas
By the time the design team enters this third stage, its members completely understand the
user and the matters of concern to the user, and have defined the problem as exactly as they
can. Now is the time for the team to come up with innovative ideas by thinking creatively and
‘outside the box’.
Stage 4: Prototype—Start to Create Solutions
This next stage is an experimental phase with the sole objective of identifying the best possible
solution to the problem. The aim is to find solutions that are
possible, inexpensive, and achievable.
Stage 5: Test—Try Your Solutions Out
At this final stage, the team tests its solutions to check their feasibility and recommends the
best possible one. As this is an iterative process, the results obtained may be used to redefine
one or more of the solutions identified, by returning to previous stages in the process in
order to make further changes or refinements, or to rule out particular alternative solutions.

Problem Identification
Creating the Problem Statement
A problem statement should be designed to address the Five Ws (Who, What, When, Where,
and Why). A simple and well-defined problem statement can be used by every member of a
project team to understand the problem and work together toward developing a solution.
The purpose of a problem statement is to identify and explain the
problem itself. As such, it includes a description of the environment where the problem exists
and the impact that it has on other elements such as users, finances, resource allocation, or
additional activities. A problem statement also explains the anticipated environment the
solution is to run in. This definition helps create a holistic overview and
defines the problem in a detailed but clear manner. Furthermore, the project goals to be
accomplished and the purpose for initiating the project can also be written clearly, without
doubt, ambiguity, or uncertainty of any kind.
Another useful purpose of a problem statement is that it can be used as a means of
communicating with others. It helps the project team identify any support staff and
other kinds of expertise that may be needed to complete the project. Before the start of the
project, the people involved need to understand the problem and the goals, not only from a
team perspective but also from an individual contributor’s perspective. This step makes clear
what the goal of the project is as well as the role that each team member will play in the
execution of the project.
Defining the boundary of the problem
Every problem has limits beyond which the solution devised for it
is not applicable. The extent of these limits is known as the boundary of the
problem. In reality, there is no physical boundary, but rather an understanding shared
by the team members.
To define the boundary of a problem, it is important to focus on the real issues that make up
the problem. This can be achieved only when these issues have been thoroughly clarified. All
team members should understand that the boundary is a clear demarcation between the
factors that will greatly impact the problem and lesser-affecting factors. Lesser-affecting factors
are not considered to be within the boundary of the problem definition and are thus not
considered when creating the solution.
It is important to understand that the problem boundary will differ for each person involved
in devising solutions for a problem or set of problems. This is based on their
understanding of the problem. Their understanding will also be affected by their concerns and
any underlying human biases. This may pose slight setbacks in the project. However, as the
design team works cohesively, any biases are more often than not dealt with early in the
process.
How will data be accessed, managed and analyzed?
Machine learning is an especially important component of artificial intelligence, and it relies
heavily on data. What a machine learning device will or will not achieve depends on the kind of
data it is given as input. Hence, it is vitally important that the input data is acquired from
sources that are reliable and that it has not been tampered with or altered in any way.
Every algorithm needs to be fed a particular kind of data, depending on the task
the machine is expected to perform. Training any AI-based algorithm often requires thousands
or even millions of points of information. The data required may often be unavailable, as access
to it may conflict with privacy rules or government regulations on sharing data. In other words,
gaining access to data is a complicated process. Hence, best-practice policies need to be in
place to ensure that all AI-related systems follow the rules of privacy and security
uniformly.
What are the privacy and security aspects around the data collection?
There have been incidents in the past that have raised questions about the integrity of the data
used by machine learning systems. The organization handling the data must ensure that its
security is maintained and that it is kept out of reach of people with malicious intentions.
The amount of data collected and stored each day is enormous. Organizations gather data from
innumerable sources such as live data feeds, blogs and social media. Managing this is an
extensive task, so strong, robust data management systems and stringent laws are needed to
protect the data from misuse of any kind.
Access to data
Data Access Statements are statements that are created to document the datasets required to
support a specific purpose and the necessary conditions under which they can be found and
used.
Research data archives and repositories provide users with a permanent identifier for the data
they house, known as a Digital Object Identifier (DOI) or accession number.

Openly available data
The following are things that should be provided in a Data Access Statement:
 The name(s) of the repositories or archives that hold and manage the dataset(s)
 The persistent identifier for your dataset
 Data subject to access restrictions
 Justification for the data to be subject to access restrictions (for example, ethical, legal, or
commercial sensitivity)
 Information on arrangements for accessing the data, including the persistent identifier to the
dataset, or a statement that the data are not accessible
 If you have used secondary or third-party data, the data source should be credited
 If you have used secondary or third-party data, you can also provide information on how the
data were accessed
Data Access Statements can also be combined with formal data citations, particularly when a
publication is supported by multiple datasets in different locations.

Development and Understanding of the BOT Framework
What is a BOT?
Have you ever watched someone in your family shop on an e-commerce website? Often, a small
window pops up in a corner of the screen offering assistance in case they have difficulty with
shopping or making a payment. This application chats with the user in a human-like manner
and is known generically as a BOT, short for robot.

Fig 2.16: A user communicating with a chat BOT of an online shopping site

A BOT can be defined as an application designed to perform a specific automated task. Some
examples of BOTs are Cortana (Microsoft) and Clippy (Microsoft).

Fig 2.17: Clippy, an early Microsoft office assistant

Though BOTs are capable of performing helpful tasks, many malicious BOTs also exist, which
can install a virus on your computer and cause great damage. Here, however, we will study
CHATBOTs, which are designed to hold conversations with a human and provide the information
they require as accurately as possible.

What can a BOT do?

CHATBOTs exist that can order food, write an email for us, set alarms, tell us about our current
finances, help us save money, shop for us, and find tourist spots, monuments and even
restaurants close to our current location. For example, Digit is a CHATBOT that helps manage
expenses by showing you your bank balance and upcoming bills, and tries to help you save
money by offering relevant financial services via text message. Another example is the
‘WHO CHATBOT’ available in Microsoft Teams, which searches for anyone inside your
organization. Calendar BOT, also in Microsoft Teams, helps you find the best date and time
for a meeting by checking your calendar against those of the people you want to meet.

The BOT Framework

Fig 2.18: A Generic BOT Framework

A generic BOT framework consists of a communication channel between the user and the
CHATBOT. The CHATBOT interprets the text written by the user using natural language
processing (NLP) and translates it into a form that the machine can understand.
The CHATBOT then communicates with a cloud service to fetch the necessary response,
translates it from machine language into the natural language of the user, and communicates
the response to them.
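
The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the keyword lists stand in for a real NLP model, the RESPONSES dictionary stands in for the cloud service, and every name in the code (INTENT_KEYWORDS, detect_intent, fetch_response, chatbot_reply) is hypothetical rather than part of any real BOT framework.

```python
# Hypothetical keyword lists standing in for a real NLP model.
INTENT_KEYWORDS = {
    "order_status": ["order", "delivery", "track"],
    "payment_help": ["pay", "payment", "card"],
}

# Hypothetical stand-in for the cloud service that supplies responses.
RESPONSES = {
    "order_status": "Your order is on its way.",
    "payment_help": "You can pay by card or on delivery.",
    "unknown": "Sorry, I did not understand. Could you rephrase?",
}


def detect_intent(message):
    """Turn the user's free text into a machine-readable intent name."""
    words = [word.strip("?.,!") for word in message.lower().split()]
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(word in keywords for word in words):
            return intent
    return "unknown"


def fetch_response(intent):
    """Look up the reply, as a cloud service might do for a real BOT."""
    return RESPONSES.get(intent, RESPONSES["unknown"])


def chatbot_reply(message):
    """User text -> intent -> response text sent back to the user."""
    return fetch_response(detect_intent(message))


print(chatbot_reply("Where is my order?"))
```

A real CHATBOT would replace detect_intent with a trained language-understanding model, but the three steps (interpret, fetch, reply) stay the same.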

Data Labeling
One of the most important requisites of supervised learning is the labeling of data. With
artificial intelligence having more of an impact on our daily routines, there is a constant need
to upgrade the machines in order to continue to provide results with ever enhanced accuracy.
To accomplish this the data input into the algorithm must be precisely labeled.
Look at the image in Figure 2.19 closely. What do you see? In it, unlabeled data gives a learning
machine no information about what is in the image.

Fig 2.19: Unlabeled data is difficult for Machine Learning

As such the machine cannot learn much about it and therefore the outcomes are inaccurate.
From a machine point of view the need for accurately labeled data is of the utmost priority in
order for it to understand what the image shows, what is written in a piece of text, and even
what a sound recording contains.

In the subsequent image (Figure 2.20), the data has been labeled. The machine can now easily
identify what the image contains and can find similar patterns in other images when fed
similar data. Data labeling is a process that involves drawing electronic bounding boxes on
image files and tagging them with keywords that are related and relevant to the item within
the box. It can also involve other processes, such as marking points on a human face to
analyze facial features for use in person-identification systems such as those used by the
police. Another important aspect is the categorization of text, audio files and videos based
on their content. In our example, the tag for a boxed vehicle could be ‘car’, as the traffic
image shows many kinds of vehicles including cars, mini trucks, open vans, two-wheelers and
buses.
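
To make the idea of a labeled image concrete, here is a minimal sketch of how bounding-box labels might be stored. The BoundingBox class, its field names and all the pixel values are invented for illustration; real labeling tools use their own file formats.

```python
from dataclasses import dataclass


@dataclass
class BoundingBox:
    label: str   # keyword describing the object, e.g. "car"
    x: int       # left edge of the box, in pixels
    y: int       # top edge of the box, in pixels
    width: int
    height: int


# Hypothetical labels for one traffic image; every number is made up.
annotations = [
    BoundingBox("car", 40, 120, 200, 90),
    BoundingBox("bus", 300, 80, 260, 150),
    BoundingBox("car", 600, 140, 180, 85),
]


def count_labels(boxes):
    """Count how many boxes carry each label."""
    counts = {}
    for box in boxes:
        counts[box.label] = counts.get(box.label, 0) + 1
    return counts


print(count_labels(annotations))
```

Each box pairs a region of the image with a keyword, which is exactly the information a learning algorithm needs in order to connect pixels with names.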

Fig 2.20: Labeled data is easy for Machine Learning

As mentioned earlier, labeling of data may also involve the identification and marking of
certain points on the face, such as the nose and eyes. Faces are marked like this from various
angles so that the machine can recognize a human face more reliably. In the same way, the
label is applied repeatedly to every instance of an object in the image, a car in our example,
to teach the machine that the label applies to the car irrespective of its appearance, position,
color, or the angle from which the image was captured.
Similarly, the machine needs to learn to analyze both text and the sentiment it expresses in
order to produce accurate textual outcomes. In a text scenario, the natural language is
structured in such a manner that the algorithm can understand and compute the relevant
meaning of the text. For spoken text, the machine needs to understand not only the word but
also the tone and context in which it is spoken, in order to gauge its true meaning and
attribute it to an emotion.

Machine Learning and CHATBOTs
Robots and Humans
With advancements in science and robotics, we have slowly entered an era where robots
perform many tasks both at work and in our personal lives. They can now be found doing daily
household chores such as vacuuming, driving vehicles, disarming bombs, controlling artificial
(prosthetic) limbs, supporting surgical procedures, manufacturing products, entertaining,
teaching, and a lot more.
Why do you think robots are being created to perform certain kinds of work that humans have
traditionally done for years? The main reasons are speed, accuracy, and reduced risk to human
life. A robot can work faster and more efficiently than a human; this is why assembly lines,
where the tasks are routine and do not have unexpected variations, use robots. Robots are also
being used on farms to help farmers remove weeds or unwanted plants from the field. Another
use of robotics is to minimize human error when performing tasks.

BOTS – Redefining Workplaces

Fig 2.21: Conversational BOTs at Workplaces

Image credit: https://chatbotsmagazine.com/7-types-of-bots-8e1846535698

A BOT is a computer program that performs automated, repetitive tasks. It also acts as the
primary tool for automating interactions with website content on a large scale. BOTs on the
Internet are not a new concept and have been around for many years. BOT software is easy to
implement and can serve a variety of purposes.

The Science behind the Generation of a BOT

Fig 2.22: The Lifecycle of a Bot

Building blocks - Program for a BOT

Building a BOT may appear quite intimidating. However, a variety of tools and techniques
make it much easier. Here are the general steps required to create a BOT.
Stage 1: Requirements
Be aware of who the target user is, what their concerns are, and what benefits the solution is to
deliver. Gathering these market requirements is the first step towards the creation of a BOT.
Stage 2: Spec (Specification)
This is the product specification for the creation of a BOT. It puts down in writing the features
and necessary functionality required of the BOT. These features should be identified in Stage 1.
Note - The spec must also include a short and long description of the BOT along with other
things that will be required later at the publishing stage.
Stage 3: Script
This stage is used to build conversational scripts that represent user interactions with the BOT.
The scripts must represent actual user conversations.
Stage 4: Architect
Once the above steps are done, the next step is to produce the engineering design for the
BOT, which includes both the front-end and back-end components. The front-end refers to the
conversational interface, whilst the back-end refers to the computations performed by the BOT
as well as its interactions with other web services.
Stage 5: Development
In this stage, BOT developers use an iterative coding and testing development process. As soon
as the BOT is coded to handle a specific set of conversational statements, it’s good practice to
test the code via the messaging interface. It is also a good habit to insert tracking probes into
the BOT to ‘Track’ various stages of its implementation.
Stage 6: Test
The testing process is deeply intertwined with the development process. The code must be
tested not just in a BOT emulator, but also in the actual messaging platform in a real-time
scenario.
Stage 7: Deploy
Once the BOT is built it must be deployed to the hosted environment. The hosted environment
must be stable and needs to offer monitoring and development support.

Demonstration

Fig 2.23: LUIS in action
Click on the link to watch the video demonstration: https://youtu.be/9tdkIQ-nkdo

Introduction to Supervised Machine Learning
Major machine learning methods
Widely adopted machine learning methods include supervised, unsupervised, semi-supervised
and reinforcement learning.

Supervised learning
Supervised learning algorithms are trained using labeled examples, that is, inputs for which the
desired output is known. The learning algorithm receives a collection of inputs along with the
corresponding correct outputs, learns by comparing its actual output with the correct output
to find errors, and modifies its model accordingly. Using methods such as classification,
regression and prediction, supervised learning uses patterns to predict the values of the label
on additional unlabeled data. Supervised learning is usually employed in applications where
historical data is likely to predict future events.
There is another form of machine learning known as unsupervised machine learning. In this
form of machine learning there are no labeled examples and the outcome is unknown.
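
The supervised learning loop described above (produce an output, compare it with the correct output, adjust the model) can be sketched with a perceptron. The perceptron is an illustrative choice made here, not an algorithm named by this guide, and the four training examples are invented.

```python
def predict(weights, bias, x):
    """Return 1 if the weighted sum of the inputs is positive, else 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0


def train_perceptron(examples, epochs=20, learning_rate=1.0):
    """Learn weights from (input, correct_output) pairs."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in examples:
            # Compare the actual output with the correct output.
            error = target - predict(weights, bias, x)
            # Modify the model in proportion to the error (no change if 0).
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Invented labeled examples: two small inputs labeled 0, two large labeled 1.
examples = [((1, 1), 0), ((2, 1), 0), ((4, 5), 1), ((5, 4), 1)]
weights, bias = train_perceptron(examples)

# The trained model can now label an input it has never seen.
print(predict(weights, bias, (6, 6)))
```

Because the two groups in this toy data can be separated by a straight line, the updates stop once every training example is predicted correctly.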

Fig 2.24: Pictorial representation of Supervised Learning

Semi-Supervised learning
This type of learning is used for the same kinds of applications as supervised learning, but uses
a mix of labeled and unlabeled data. Semi-supervised learning is useful when the cost
associated with labeling is too high to allow for a fully labeled training process. An early
example of this is a system that distinguishes an individual's face on a webcam from other
faces.

Fig 2.25: Pictorial representation of Semi-supervised Learning

Reinforcement learning
This form of learning is used in robotics, gaming and navigation applications. In reinforcement
learning the algorithm discovers through trial and error which actions yield the greatest
rewards. This type of learning has three primary components: the agent (the learner or
decision-maker), the environment (everything the agent interacts with) and actions (what the
agent can do).
The objective is for the agent to choose actions that maximize the expected reward over a
given amount of time. The agent will reach the goal much faster by following a good strategy.
The goal in reinforcement learning is to learn the best strategy.
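
As a rough illustration of agent, environment, actions and reward, the sketch below trains an agent to walk along a corridor of five cells to reach a reward in the last cell. It uses Q-learning, which is one common reinforcement learning method chosen here for illustration; the corridor, the reward of 1.0 and all parameter values are assumptions.

```python
import random

N_STATES = 5           # corridor cells 0..4; the reward sits in cell 4
GOAL = N_STATES - 1
ACTIONS = [-1, +1]     # action 0 = step left, action 1 = step right


def step(state, action):
    """Environment: move the agent and return (next_state, reward)."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Agent: learn the value of each action in each cell by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action index]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            if rng.random() < epsilon:
                a = rng.randrange(2)             # explore: random action
            else:
                best = max(q[state])             # exploit: best known action
                a = rng.choice([i for i in range(2) if q[state][i] == best])
            next_state, reward = step(state, ACTIONS[a])
            # Move the estimate toward reward plus discounted future value.
            target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """The learned strategy: the preferred action in each non-goal cell."""
    return ["left" if row[0] > row[1] else "right" for row in q[:GOAL]]


q_table = train()
print(greedy_policy(q_table))
```

The list printed at the end is the strategy the agent has learned: one preferred action for each cell before the goal.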

Fig 2.26: Pictorial representation of Reinforcement Learning

Supervised Machine Learning
Supervised machine learning is said to take place when the learning algorithm is provided with
labeled data. The labels help the machine understand the data and generalize a model from it.
Supervised learning works on ‘training data’, without which the algorithm cannot give correct
results. The purpose of the training data is to guide the machine learning model.
The training data consists of a set of training examples. In supervised learning, each example
is a pair consisting of an input object and the desired output value.
A supervised learning algorithm analyzes the training data and produces a function that infers
a result for a given input. This function is then used to map new examples to outputs. In an
optimal scenario, the algorithm will correctly determine the class labels for instances it has
never encountered before.

Classroom Activity
 Suppose you had a basket filled with blocks of different shapes (circle, square, triangle and
rectangle).
 Your task is to arrange them into groups.
 To understand the task, first assign names to these shapes.
 We have four types of block, called circle, square, triangle and rectangle. But they could be
called anything.
You learn from previous information about the physical characteristics of the blocks, so first
arrange some of the blocks from the basket into groups of the same type. In data mining
terminology this earlier work is called training, and the sorted blocks are the training data.
Now take a new block from the basket, note its size and shape, and put it in the right group
based on what you have learned from your analysis of the previous blocks.
This is supervised learning. The dataset you will have used to classify the blocks is as
follows:

Ser.  Description of the block                                          Block Name
1.    Round without corners                                             Circle
2.    With 4 straight sides of equal length and 4 corners               Square
3.    With 3 straight sides of equal or unequal length and 3 corners    Triangle
4.    With 4 straight sides (opposite sides are equal) and 4 corners    Rectangle

Table 1: Dataset for Classroom Activity
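
The classroom activity can be mimicked in code. In this sketch each block is described by three numbers (straight sides, corners, and whether all sides are equal), which is one assumed way of encoding the descriptions in Table 1. A new block is given the name of the most similar training block, a method known as nearest-neighbor classification.

```python
# (straight sides, corners, all sides equal: 1 = yes, 0 = no) -> block name
training_blocks = [
    ((0, 0, 0), "Circle"),
    ((4, 4, 1), "Square"),
    ((3, 3, 1), "Triangle"),
    ((3, 3, 0), "Triangle"),
    ((4, 4, 0), "Rectangle"),
]


def distance(a, b):
    """Squared difference between two block descriptions."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def classify(block):
    """Name a new block after the most similar block seen in training."""
    features, name = min(training_blocks,
                         key=lambda example: distance(example[0], block))
    return name


# A new block from the basket: 4 sides, 4 corners, all sides equal.
print(classify((4, 4, 1)))
```

The training blocks play the role of the groups you sorted by hand, and classify plays the role of deciding where the next block belongs.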

Introduction to Unsupervised Machine Learning

Earlier we read about various types of machine learning. As well as the supervised way of
learning, there is also unsupervised machine learning, in which the machine is not given
labels for the data used as input.

Unsupervised learning
This kind of learning is used with data that has no historical labels. The algorithm is not
given the correct answers and must discover structure in the data on its own. (When a dataset
contains a few labeled data points alongside many unlabeled ones, the task is semi-supervised
learning, described earlier.) The goal is to explore the data and recognize patterns contained
within it.
Unsupervised learning works well on transactional data. For example, it can identify customers
with similar attributes who can then be treated similarly in marketing campaigns. Or it can
find the most prominent attributes that a particular group of customers have in common.
Popular techniques which use this form of learning include self-organizing maps,
nearest-neighbor mapping and k-means clustering.

Fig 2.27: Pictorial representation of Unsupervised Learning

Earlier, we learnt that the machine-based algorithm will learn from datasets which are labeled
in order to provide guidance to the algorithm to compute the answer. As stated earlier, there
can be times when the dataset may not have any labels or may have been partially labeled. In
such a scenario, the machine still learns.

Clustering
With no formal guide on the labeling of data points, unsupervised machine learning depends on
something called ‘clustering’. Clustering algorithms run on the dataset and identify clusters
only if they exist in the data.
Clustering can be defined as the process of dividing the data points into groups in such a way
that all the data points in a group share similar properties. The main goal of clustering is to
create groups of data points with similar traits and assign each group a label.
A few types of clustering are as follows:
 K-Means Clustering – Involves clustering the data points into K clusters. Choosing a suitable
value for K is itself a difficult problem.
 Hierarchical Clustering – Involves clustering of the data points into parent and child clusters.
 Probabilistic Clustering – Involves assigning the data points to clusters on a probabilistic
scale, according to how likely each point is to belong to each cluster.
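
To show what K-Means clustering does, here is a minimal sketch written in plain Python. The six data points are invented, K is fixed at 2, and the starting centroids are simply the first K points; all of these are simplifying assumptions, and library implementations choose their starting points more carefully.

```python
def kmeans(points, k, iterations=10):
    """Group points into k clusters; returns (labels, centroids)."""
    centroids = [points[i] for i in range(k)]    # naive start: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            distances = [sum((a - b) ** 2 for a, b in zip(p, c))
                         for c in centroids]
            labels[i] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return labels, centroids


# Invented data: three points near (1, 1) and three points near (8, 8).
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

No labels were supplied: the algorithm groups the points using only the distances between them.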
Clustering can be used in a variety of applications. Some of them can be:
 Recommendation engines
 Market segmentation
 Social network analysis
 Search result grouping
 Medical imaging
 Image segmentation
 Anomaly detection

Assessment Questions
1. Define data. List examples of data that you think your school would gather from others or
prepare itself, and explain how the data could be used.

2. What do you associate with sentiment in a human being?

3. How is machine learning useful in meteorology?

4. How can customer services be improved by using sentiment analysis?

5. Write down the correct term for each of these definitions:

6. Why is machine learning needed in today's world?

7. Give at least 4 real-life examples of how machines are replacing humans.

8. Use any example from your daily life to explain what you have understood about
unsupervised and supervised learning.

9. Explain the use of reinforcement learning in gaming.

10. What is the process of marking human face related data points?

11. What is Machine Learning?

12. What are the probable applications of Supervised Machine Learning?

13. Why is labeled data easier for a machine to learn from?

14. Fill in the Blanks

 In data mining terminology, the earlier work is called ________.

 Supervised learning uses _____.

 A generic BOT framework consists of a ________.

 A CHATBOT translates the text written by the user using a ______

True or False
 Unlabeled data is less expensive and takes less effort to acquire.

 Semi supervised learning uses only unlabeled data.

 In reinforcement learning, the algorithm discovers through trial and error which actions yield
the greatest rewards.

Questions to consider
 Name the major differences between alarms and reminders.
https://medium.com/truemd/whats-the-difference-between-an-alarm-and-a-reminder-a73c11dc1a73

 What challenges is Cortana likely to face when asked to play a song that also has a remake version?
https://www.howtogeek.com/402579/i-used-a-cortana-smart-speaker-all-weekend.-heres-why-it-failed/

 Probe the various aspects of the debate of ‘human vs. BOT interaction’.
https://www.intercom.com/blog/bots-versus-humans/
https://www.retaildive.com/news/70-of-consumers-still-want-human-interaction-versus-bots/543324/
https://cyfuture.com/blog/the-great-bot-battle-ai-chatbots-vs-human-powered-live-chat

 Investigate the various data labeling approaches and find out their pros and cons, with suitable
examples.
https://www.kdnuggets.com/2018/05/data-labeling-machine-learning.html

 Can a BOT replace humans?
https://www.bbntimes.com/en/technology/the-rise-of-chatbots-will-they-replace-humans

 Can BOTs turn malicious?
https://www.webroot.com/us/en/resources/tips-articles/what-are-bots-botnets-and-zombies
https://www.symantec.com/blogs/feature-stories/malicious-bot-attacks-why-theyre-more-dangerous-ever

 Elaborate the steps for data protection. Name the steps to annotate the data.
https://www.richardsandrichards.com/6-steps-to-complete-data-protection-for-your-small-business/
https://resources.infosecinstitute.com/how-to-implement-a-data-privacy-strategy-10-steps/#gref
https://medium.com/thelaunchpad/spinning-up-an-annotation-team-c74c6765531b

 Deduce how much data is required for analysis by the machine.
https://machinelearningmastery.com/much-training-data-required-machine-learning/
https://towardsdatascience.com/how-do-you-know-you-have-enough-training-data-ad9b1fd679ee

 Find communities which provide free data for research.
https://www.nature.com/sdata/policies/repositories

 Design steps for ‘Tic Tac Toe’ so that the computer's chances of winning are maximized.
https://www.wikihow.com/Win-at-Tic-Tac-Toe

 Calculate the best moves to solve the ‘Tower of Hanoi’ problem.
https://en.wikipedia.org/wiki/Tower_of_Hanoi

 Write down the different appliances that generate datasets in a smart home, for example:
o Security camera related data
o Thermostat related data
o Electricity consumption data

Some Practical Assignments/Lab Work
Assignment 1
Use the Bing search engine to prepare a report on the following.
A. Types of machine learning.
B. Heuristic search in AI.
C. Knowledge representation technique.
D. AI based games for competitive entertainment such as Chess.
Assignment 2
Outline the challenges that can be encountered in problem identification.
https://www.toolshero.com/problem-solving/problem-definition-process/
Assignment 3
Evaluate the importance and role of an identifier in the dataset.
https://www.ngdc.noaa.gov/wiki/index.php/Data_Set_Identifiers_and_other_Unique_IDs
https://www.dataone.org/best-practices/provide-identifier-dataset-used
Assignment 4
Create datasets for the following:
 Image processing – Create a dataset of at least 100 images of natural scenes
 Sentiment Analysis – Choose a product and search for it on more than one e-commerce
website. Write down all the reviews (not fewer than 100) written for the product in a
document, grouped under each website's name.
 Video processing – Shoot or download birthday party videos (not fewer than 50) and collect
them in a folder.
 IoT – Download any three IoT datasets online. Suggested: weather data, traffic data,
agriculture data and smartphone data.


Assignment 5
Suggested questions for different scenarios:
Club membership enquiry
 What is your age (check for eligibility)?
 What games do you play?
 What are your play timings?
 Duration for membership – quarterly, bi-monthly, half yearly or yearly?
Career guidance or higher education
 Percentage scored in class 10th exams?
 Present options for subjects?
 What is the user's preference? (Present streams: science, arts, commerce, commerce with
mathematics etc.)
Design a possible BOT interaction, with questions and answers, for the club membership
enquiry scenario.
Design a possible BOT interaction, with questions and answers, for the career guidance or
higher education in AI scenario.
Assignment 6
Suggested keywords
 Staff communication mails – staff, teacher, educator, subject, class teacher, discipline,
permission, class, etc.
 Parent complaint mails – ward, mother, father, guardian, class, student, complaint, unaware
etc.
 Educational bodies mails – authority, board, school, inspection, requirements etc.
 Vendor mails – vendor, issue, payment, permission, principal, office, dated etc.
 Co-Curricular notification mails – notification, circular, school, district, state, level,
competitions, class, participation, participate etc.
It is to be noted that there could be certain keywords that would be common to more than one
communication type. The machine is expected to focus on both similar and dissimilar keywords
and labels to identify and segregate.
Consider the scenario of a school where the principal needs help from a machine-based
application with mail segmentation. Students are to consider segregation of the mails in the
following categories
 Staff communication mails
 Parent complaint mails
 Educational bodies mails
 Vendor mails

 Co-Curricular notification mails
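
One possible starting point for this assignment is a simple keyword counter, sketched below. It is not a trained machine learning model: it only counts how many of a category's keywords appear in a mail. The keyword sets are a subset of the suggested keywords above, and the function name classify_mail and the sample mails are invented.

```python
import re

# A subset of the suggested keywords for each category of mail.
CATEGORY_KEYWORDS = {
    "Staff communication": {"staff", "teacher", "educator", "subject", "discipline"},
    "Parent complaint": {"ward", "mother", "father", "guardian", "complaint"},
    "Educational bodies": {"authority", "board", "inspection", "requirements"},
    "Vendor": {"vendor", "payment", "office", "dated"},
    "Co-Curricular notification": {"notification", "circular", "district",
                                   "competitions", "participation"},
}


def classify_mail(text):
    """Pick the category whose keywords appear most often in the mail."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {category: len(words & keywords)
              for category, keywords in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Unclassified"


print(classify_mail("My ward came home upset and I wish to file a complaint."))
```

Keywords shared by several categories, such as ‘class’, would add to more than one score, which is why the assignment asks you to think about both similar and dissimilar keywords.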

Assignment 7
Prepare a detailed report on how machines develop intelligence and learn from reinforcement
methodology in a game of chess.
https://www.infoworld.com/article/3400876/reinforcement-learning-explained.html
Assignment 8
Design a student assistance program for students with low performance. How can AI assist in
identifying the students who are struggling?

Assignment 9
Refer to the website below to understand the IRIS dataset and answer the following questions.
https://archive.ics.uci.edu/ml/datasets/iris
A. What are the features/attributes of the dataset?
B. What are the targets/classes of the dataset?
C. How many rows are there in the dataset?
D. Are there any missing values in the dataset?
E. Is the data univariate or multivariate?
F. If we follow a 60:20:20 pattern for training, validation and testing, how many rows will there
be in each subset?
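
The arithmetic behind a 60:20:20 split can be sketched as follows. The function name split_counts is hypothetical, and the example uses an invented dataset of 10 rows so that the Iris answer is left for you to work out.

```python
def split_counts(n_rows, train_pct=60, validate_pct=20):
    """Return (train, validate, test) row counts for a percentage split."""
    n_train = n_rows * train_pct // 100
    n_validate = n_rows * validate_pct // 100
    # Whatever is left over goes to the test set, so no row is lost.
    n_test = n_rows - n_train - n_validate
    return n_train, n_validate, n_test


print(split_counts(10))
```

Giving the remainder to the test set means the three counts always add up to the total number of rows, even when the percentages do not divide evenly.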

Assignment 10
Perform classification of students to understand who would be interested in joining the sports
club of the school.
Lab Session -1: Create a dataset or use any existing free dataset
Lab Session -2: Study the dataset of the students
Lab Session -3: Dataset should include:
 Student ID/Admission number
 Interest in sports (name of the sport in which the student is interested in participating)
 Achievement in the sports
 Academic scores
 Distance from school to home
 Height and shoe size

Practical Assignments
Assignment 1
Imagine a situation at home where your family is expecting guests. You have lights at various
locations both indoors and outdoors. There are lights at the doorway, near the gate, along the
pathway, in the garden and also in the interior rooms of the house. Each family member has a
different opinion about which light to switch on. Write down the problem statement and
alternative solutions.
Assignment 2
Imagine a hypothetical situation where you are exploring suitable career choices with the
help of a CHATBOT that assists you in the process. Prepare a set of question/answer trails.
Assignment 3
Create a bank of images (more than 50). It should contain images of people with emotions
(various ages, color, expressions etc.). Using Microsoft’s online ‘Face and Emotion Recognition’
application, run the images to predict the emotion and analyze visual content.
https://aidemos.microsoft.com/face-recognition

Further Reading
 https://www.forbes.com/sites/willemsundbladeurope/2018/10/18/data-is-the-foundation-for-artificial-intelligence-and-machine-learning/#3eccba5251b4
 https://towardsdatascience.com/role-of-data-science-in-artificial-intelligence-950efedd2579
 http://www.dbta.com/BigDataQuarterly/Articles/The-Importance-of-Data-for-Applications-and-AI-129316.aspx
 https://www.technative.io/data-quality-vs-data-quantity-whats-more-important-for-ai/
 https://pjreddie.com/darknet/yolo/
 https://www.houseofbots.com/news-detail/3581-4-understand-the-machine-learning-from-scratch-for-beginners
 https://www.minigranth.com/artificial-intelligence/problem-solving-in-artificial-intelligence/
 Problem Solving in Artificial Intelligence by Prof Philippe Codognet. Link -
http://webia.lip6.fr/~codognet/PSAI/1-introduction.pdf
 Introduction to Artificial Intelligence: Problem Solving and Search by Bernhard Beckert, 2004. Link -
https://formal.iti.kit.edu/~beckert/teaching/Einfuehrung-KI-WS0304/04ProblemSolving.pdf
 Learning problem solving (artificial intelligence, machine) by Bruce Walter Porter, University of
California, Irvine, 1984.
 Learning problem solving strategies using refinement and macro generation by HA Güvenir, GW Ernst,
Elsevier Science Publishers B.V. (North-Holland), 1990. Link -
http://repository.bilkent.edu.tr/bitstream/handle/11693/26215/bilkent-research-paper.pdf?sequence=1&isAllowed=y
 Microsoft Power BI Dashboards Step by Step 1st Edition by Errin O'Connor
 The 5 Clustering Algorithms Data Scientists Need to Know -
https://towardsdatascience.com/the-5-clustering-algorithms-data-scientists-need-to-know-a36d136ef68
 Clustering Introduction & different methods of clustering -
https://www.analyticsvidhya.com/blog/2016/11/an-introduction-to-clustering-and-different-methods-of-clustering/
 What is Data Labeling? - https://www.youtube.com/watch?v=_BasmAAub7w
 Why Smart Labeling is the Future of Data Annotation - https://www.youtube.com/watch?v=V33Ut36eUsY
 Four Mistakes You Make When Labeling Data -
https://towardsdatascience.com/four-mistakes-you-make-when-labeling-data-7e431c4438a2
 Practical Machine learning problems -
https://machinelearningmastery.com/practical-machine-learning-problems/
 https://www.messengerpeople.com/chatbots-what-is-a-whatsapp-bot-actually/
 https://www.geeksforgeeks.org/what-is-reinforcement-learning/
 https://deepsense.ai/what-is-reinforcement-learning-the-complete-guide/

47
Reference Links
 Algorithmia (2018). Introduction to Unsupervised Learning | Algorithmia Blog. [online]
Algorithmia Blog. Available at: https://blog.algorithmia.com/introduction-to-unsupervised-
learning/ [Accessed 10 Sep. 2019].
 Al-Masri, A. (2019). What Are Supervised and Unsupervised Learning in Machine Learning?
[online] Medium. Available at: https://towardsdatascience.com/what-are-supervised-and-
unsupervised-learning-in-machine-learning-dc76bd67795d [Accessed 6 Sep. 2019].
 Author (2019). Data labeling service: training data for machine learning | Clickworker. [online]
Clickworker.com. Available at: https://www.clickworker.com/crowdsourcing-glossary/data-
labeling/ [Accessed 6 Sep. 2019].
 Automationanywhere.com. (2019). TAKE CHARGE OF THE BOT LIFECYCLE. [online] Available at:
https://www.automationanywhere.com/in/solutions/enterprise-bot-lifecycle-management
[Accessed 12 Jul. 2019].
 Brownlee, J. (2015). Basic Concepts in Machine Learning. [online] Machine Learning Mastery.
Available at: https://machinelearningmastery.com/basic-concepts-in-machine-learning/
[Accessed 29 Jun. 2019].
 Chen, J. (2019). Neural Network Definition. [online] Investopedia. Available at:
https://www.investopedia.com/terms/n/neuralnetwork.asp [Accessed 27 Sep. 2019].
 Chris (2009). How To Write A Problem Statement | Ceptara. [online] Available at:
http://www.ceptara.com/blog/how-to-write-problem-statement [Accessed 4 Jul. 2019].
 Decypher. (2018). Machine Learning: What it is and Why it Matters - Decypher. [online] Available
at: https://www.decypher.com/machine-learning-matters/ [Accessed 4 Jul. 2019].
 Dietrich, D., Heller, B. and Yang, B. (2015). Data Science & Big Data Analytics: Discovering,
Analyzing, Visualizing and Presenting Data. [ebook] Indianapolis: John Wiley & Sons, Inc., pp.29-30.
Available at: http://index-of.co.uk/Big-Data-Technologies/Data%20Science%20and%20Big%20Data%20Analytics.pdf
[Accessed 13 Sep. 2019].
 Guru99team (2019). Supervised Machine Learning: What is, Algorithms, Example. [online]
Guru99.com. Available at: https://www.guru99.com/supervised-machine-learning.html
[Accessed 6 Sep. 2019].
 Kaushik, S. (2016). Clustering Introduction & different methods of clustering. [online] Analytics
Vidhya. Available at:
https://www.analyticsvidhya.com/blog/2016/11/an-introduction-to-clustering-and-different-methods-of-clustering/
[Accessed 10 Sep. 2019].
 Loon, R. (2018). Machine learning explained: Understanding supervised, unsupervised, and
reinforcement learning. [online] Big Data Made Simple. Available at:
https://bigdata-madesimple.com/machine-learning-explained-understanding-supervised-unsupervised-and-reinforcement-learning/
[Accessed 6 Sep. 2019].
 Maj, M. (2019). Object Detection and Image Classification with YOLO. [online] Kdnuggets.com.
Available at: https://www.kdnuggets.com/2018/09/object-detection-image-classification-yolo.html
[Accessed 29 Jun. 2019].
 McFadin, P. (2019). Internet of Things: Where Does the Data Go? [online] WIRED. Available at:
https://www.wired.com/insights/2015/03/internet-things-data-go/ [Accessed 15 Nov. 2019].
 Mitroff, S. (2019). What is a BOT? - CNET. [online] Available at:
https://www.cnet.com/how-to/what-is-a-bot/ [Accessed 5 Jul. 2019].
 Sheth, B. (2016). The BOT Lifecycle. [online] CHATBOTs Magazine. Available at:
https://chatbotsmagazine.com/the-bot-lifecycle-1ff357430db7 [Accessed 12 Jul. 2019].
 Shoemaker, C. (2019). IoT Data: How to Collect, Process, and Analyze Them. [online] Tech.
Available at: https://it.toolbox.com/blogs/carmashoemaker/iot-data-how-to-collect-process-and-analyze-them-032619
[Accessed 15 Nov. 2019].
 Simmons, D. (2019). Pushing IoT Data Gathering, Analysis, and Response to the Edge - DZone IoT.
[online] dzone.com. Available at:
https://dzone.com/articles/pushing-iot-data-gathering-analysis-and-response-to-the-edge
[Accessed 15 Nov. 2019].
 Smith, A. (2018). Understanding Architecture Models of CHATBOT and Response Generation
Mechanisms - DZone AI. [online] dzone.com. Available at:
https://dzone.com/articles/understanding-architecture-models-of-chatbot-and-r
[Accessed 12 Jul. 2019].
 University of Bath (2019). Data access statements - Archiving and sharing data - Library at
University of Bath. [online] Available at:
https://library.bath.ac.uk/research-data/archiving-and-sharing/data-access-statements
[Accessed 4 Jul. 2019].
 University of Nebraska-Lincoln (2019). Remember the 5 W's | IT Best Practices | Nebraska.
[online] Available at: https://its.unl.edu/bestpractices/remember-5-ws [Accessed 4 Jul. 2019].

Glossary
Ancient - Belonging to the very distant past and no longer in existence.
Logic - A system or set of principles underlying the arrangements of elements in a computer or
electronic device so as to perform a specified task.
Algorithms - Processes or sets of rules to be followed in calculations or other problem-solving
operations, especially by a computer.
Perceptions - The ways in which something is regarded, understood, or interpreted.
Intervention - The action or process of intervening.
Complex - A group or system of different things that are linked in a close or complicated way;
a network.
Segmentation - Division into separate parts or sections.
Sentiment - Feelings of tenderness, happiness, sadness, or nostalgia.
Emotion - A strong feeling deriving from one's circumstances, mood, or relationships with
others.
Polarity - The state of having two opposite or contradictory tendencies, opinions, or aspects.
Parameter - A numerical or other measurable factor forming one of a set that defines a system
or sets the conditions of its operation.
Non-linear - Not arranged in a straight line.
Crux - The decisive or most important point at issue.
Application - A program or piece of software designed to fulfil a particular purpose.
Data mining - The practice of examining large pre-existing databases in order to generate
new information.
Stakeholder - A person with an interest or concern in something.
Narrative - A spoken or written account of connected events.
Untagged - (Of a piece of text or data) not identified or categorized by a tag.
