Unit IV
Introduction: Well-posed learning problems, Designing a Learning System, Perspectives and
Issues in Machine learning, Concept Learning and General-to-specific Ordering: A concept
learning task, Concept learning as Search, Finding a maximally specific hypothesis, Version
Spaces and Candidate elimination algorithm, Inductive Bias.
A well-posed learning problem is defined as follows: a computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
To break it down, the three important components of a well-posed learning problem are,
● Task
● Performance Measure
● Experience
To understand the topic better let’s have a look at a few classical examples,
● Learning to play Checkers:
A computer program might improve its performance, measured as its ability to win, at the class of tasks that involve playing checkers. The performance keeps improving through the experience gained by playing practice games against itself.
To simplify,
T -> Play the checkers game.
P -> Percentage of games won against the opponent.
E -> Playing practice games against itself.
● Handwriting Recognition:
Handwriting recognition (HWR) is a technology that converts a user’s handwritten letters or
words into a computer-readable format (e.g., Unicode text).
Its applications are numerous: it is used for reading postal addresses, bank forms, etc.
T -> recognizing and classifying handwritten words from images.
P -> Percentage of correctly identified words.
E -> set of handwritten words with their classifications in a database.
With the use of optical scanners and advanced machine learning algorithms, this can be made possible.
1. Image Recognition:
Image recognition is one of the most common applications of machine learning. It is used to identify objects, persons, places, digital images, etc. A popular use case of image recognition and face detection is the automatic friend-tagging suggestion:
it is based on the Facebook project named "Deep Face," which is responsible for face recognition and person identification in pictures.
2. Speech Recognition:
While using Google, we get an option of "Search by voice"; this comes under speech recognition, and it is a popular application of machine learning.
Speech recognition is the process of converting voice instructions into text, and it is also known as "speech to text" or "computer speech recognition." At present, machine learning algorithms are widely used in speech recognition applications. Google Assistant, Siri, Cortana, and Alexa use speech recognition technology to follow voice instructions.
3. Traffic Prediction:
If we want to visit a new place, we take the help of Google Maps, which shows us the correct path with the shortest route and predicts the traffic conditions.
It predicts whether traffic is clear, slow-moving, or heavily congested in two ways:
o Real-time location of the vehicle from the Google Maps app and sensors
o Average time taken on past days at the same time of day
Everyone who uses Google Maps is helping the app to become better. It takes information from the user and sends it back to its database to improve performance.
4. Product Recommendations:
Machine learning is widely used by various e-commerce and entertainment companies, such as Amazon and Netflix, for recommending products to the user. Whenever we search for a product on Amazon, we start getting advertisements for the same product while surfing the internet in the same browser; this is because of machine learning.
Google understands the user's interests using various machine learning algorithms and suggests products according to customer interest.
Similarly, when we use Netflix, we find recommendations for series, movies, etc.; this is also done with the help of machine learning.
5. Self-Driving Cars:
One of the most exciting applications of machine learning is self-driving cars, where machine learning plays a significant role. Tesla, the most popular company in this area, is working on self-driving cars and uses machine learning methods to train its car models to detect people and objects while driving.
6. Email Spam and Malware Filtering:
Whenever we receive a new email, it is filtered automatically as important, normal, or spam, with the help of machine learning. Some of the spam filters used are:
o Content filters
o Header filters
o General blacklist filters
o Rules-based filters
o Permission filters
7. Virtual Personal Assistants:
We have various virtual personal assistants such as Google Assistant, Alexa, Cortana, and Siri. As the name suggests, they help us find information using our voice instructions.
These assistants can help us in various ways just by our voice instructions, such as playing music, calling someone, opening an email, scheduling an appointment, etc.
Machine learning algorithms are an important part of these virtual assistants. The assistants record our voice instructions, send them to a server in the cloud, decode them using ML algorithms, and act accordingly.
8. Online Fraud Detection:
Machine learning is making our online transactions safe and secure by detecting fraudulent transactions. Whenever we perform an online transaction, there are various ways a fraudulent transaction can take place, such as fake accounts, fake IDs, and money being stolen in the middle of a transaction. To detect this, a feed-forward neural network helps us by checking whether a transaction is genuine or fraudulent.
For each genuine transaction, the output is converted into hash values, and these values become the input for the next round. Each genuine transaction follows a specific pattern, which changes for a fraudulent transaction; the network detects this and makes our online transactions more secure.
9. Stock Market Trading:
Machine learning is widely used in stock market trading. In the stock market, there is always a risk of ups and downs in share prices, so a long short-term memory (LSTM) neural network is used for the prediction of stock market trends.
10. Medical Diagnosis:
In medical science, machine learning is used for disease diagnosis. With it, medical technology is growing very fast and is able to build 3D models that can predict the exact position of lesions in the brain.
11. Automatic Language Translation:
Nowadays, if we visit a new place and are not aware of the language, it is not a problem at all: machine learning helps us by converting the text into languages we know. Google's GNMT (Google Neural Machine Translation) provides this feature; it is a neural machine translation system that translates text into our familiar language, and this is called automatic translation.
Designing a Learning System:
1. Choosing the Training Experience:
The type of training experience chosen has a considerable effect on the learner's success; three attributes of the training experience matter here.
a) Type of feedback: whether the training experience gives the learner direct feedback (e.g., the correct move for each board state) or only indirect feedback (e.g., the final outcome of a game).
b) Degree: the degree of a training experience refers to the extent to which the learner can control the sequence of training.
For example, the learner might rely on constant feedback about the moves played, or it might itself propose a sequence of actions and only ask for help when in need.
c) Representativeness: how well the training experience represents the distribution of examples over which the final system performance will be measured is the third crucial attribute.
This basically means that the more diverse the set of training experiences, the better the performance can get.
3. Choosing a Representation for the Target Function:
Once done with step 2, choosing the target function, we now have to choose a representation for it. When the machine learning algorithm has a complete list of all permitted moves, it may pick the best one using any format, such as linear equations, hierarchical graph representations, tabular form, and so on.
Out of these moves, the NextMove function will pick the target move that increases the success rate. For example, if a chess machine has four alternative moves, the computer will select the most optimal move that leads to victory.
4. Choosing a Function Approximation Algorithm:
In this step, we choose a learning algorithm that can approximate the chosen target function.
This step further consists of two sub-steps: a. estimating the training values, and b. adjusting the weights.
To estimate the training value of an example, we consider the value of the successor position; to adjust the weights, one uses algorithms such as LMS (least mean squares) to find the weights of the linear function.
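To make the weight-adjustment sub-step concrete, here is a minimal Python sketch of the LMS update rule for a linear evaluation function; the board features, training value, and learning rate are illustrative assumptions, not taken from these notes.

# LMS update for a linear evaluation function
# V_hat(b) = w0 + w1*x1 + ... + wn*xn (the checkers-style representation).

def v_hat(weights, features):
    # Evaluate a board described by its feature vector.
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

def lms_update(weights, features, v_train, lr=0.1):
    # One LMS step: move each weight a little toward the training value.
    error = v_train - v_hat(weights, features)
    weights[0] += lr * error              # bias term
    for i, x in enumerate(features):
        weights[i + 1] += lr * error * x
    return weights

# Example: three board features (e.g., piece counts) and a training
# value of 100 for a winning position (assumed numbers).
w = [0.0, 0.0, 0.0, 0.0]
w = lms_update(w, [3, 0, 1], v_train=100)
print(w)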
Issues in Machine Learning:
Machine learning equips organizations with the information they need to make better-informed, data-driven decisions, faster than they could with traditional methods. It isn't, however, the mythological, magical procedure that many people imagine it to be. Machine learning has its own set of difficulties. Here are a few frequent machine learning issues and how to fix them.
Poor Quality of Data:
The lack of adequate, high-quality data is one of the most serious problems in machine learning; poor data often forces developers to spend the majority of their effort on the data rather than on the algorithms themselves. For the algorithms to perform as intended, data quality is critical: noisy data, dirty data, and incomplete data are the quintessential enemies of ideal machine learning. The solution to this conundrum is to take the time to evaluate and scope data with meticulous data governance, data integration, and data exploration until you get clean data. You should do this before you start.
Implementation Problems:
When companies opt to upgrade to machine learning, they frequently have to integrate it with existing analytics engines. It is a difficult task to combine newer machine learning algorithms with old operations. Maintaining proper documentation and interpretation will go a long way toward ensuring maximum utilization.
A few of the ways implementation problems arise are lack of sufficient data, data security issues, and slow deployment.
When being trained, ML algorithms always demand a large amount of data. They are frequently trained on a certain data set and then used to predict future data, a cycle that can only be sustained with a large amount of data.
The moment the arrangement of the data changes, the previously "correct" model over the data set may no longer be regarded as accurate.
Rather than removing a record that has a few missing attributes, we may instead fill in those empty cells, as in the sketch below. The best way to cope with these challenges in machine learning is to guarantee that your data is free of gaps and can express a significant amount of information.
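As an illustration of filling empty cells rather than dropping records, here is a small pandas sketch; the column names and the fill strategy (column mean for numeric values, most frequent value for categorical ones) are assumptions made for the example.

import pandas as pd

# Toy data with gaps; the column names are illustrative.
df = pd.DataFrame({
    "age": [25, None, 31, 40],
    "city": ["Pune", "Delhi", None, "Delhi"],
})

# Fill numeric gaps with the column mean and categorical gaps with the
# most frequent value, instead of removing the affected records.
df["age"] = df["age"].fillna(df["age"].mean())
df["city"] = df["city"].fillna(df["city"].mode()[0])
print(df)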
Another difficulty is that deep analytics and machine learning in their current forms are still relatively young technologies.
Machine learning professionals are necessary to maintain the process from the initial coding to maintenance and monitoring, and the fields of artificial intelligence and machine learning are still relatively new to the market.
It is also tough to find enough resources in the form of skilled labor. As a result, there is a scarcity of capable people to design and handle the scientific ingredients of ML. Data scientists frequently require a mix of domain knowledge as well as a thorough understanding of mathematics, technology, and science.
In today’s world of Machine Learning, separating reality from fiction is getting increasingly
challenging. You should analyze whatever challenges you’re trying to tackle before deciding
on which AI platform to utilize.
The operations that are done manually every day with no variable output are the easiest to
automate. Before automating complicated procedures, they must be thoroughly inspected.
While Machine Learning may certainly aid in the automation of some processes, it is not
required for all automation concerns.
Segmentation of Users:
Consider the data about a user's behavior throughout a testing period, as well as any relevant prior habits. An algorithm is required to distinguish between clients who will convert to a premium version of a product and those who will not.
A model for this decision problem would allow the software to generate suitable recommendations for the user based on the user's catalog behavior, as in the sketch below.
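A minimal sketch of such a conversion-prediction model, assuming scikit-learn is available; the behavioral features (sessions and catalog views) and the data rows are made up for illustration.

from sklearn.linear_model import LogisticRegression

# Hypothetical per-user behavior during the testing period:
# [sessions, catalog_views]; label 1 = converted to premium.
X = [[3, 10], [1, 2], [8, 25], [2, 4], [7, 30], [0, 1]]
y = [1, 0, 1, 0, 1, 0]

model = LogisticRegression().fit(X, y)

# Predict whether a new trial user is likely to convert.
print(model.predict([[5, 12]]))        # predicted class
print(model.predict_proba([[5, 12]]))  # conversion probability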
Q. Explain the Working of the Find-S Algorithm.
Answer:
The Find-S algorithm is a basic concept learning algorithm in machine learning. It finds the most specific hypothesis that fits all the positive examples; note that the algorithm considers only the positive training examples. Find-S starts with the most specific hypothesis and generalizes it each time it fails to classify an observed positive training example. Hence, the Find-S algorithm moves from the most specific hypothesis toward the most general hypothesis.
Important Representation:
● ? indicates that any value is acceptable for the attribute.
● A single specific value (e.g., GREEN) indicates that only that value is acceptable.
● ϕ indicates that no value is acceptable.
● The most general hypothesis is represented by {?, ?, ?, ?}.
● The most specific hypothesis is represented by {ϕ, ϕ, ϕ, ϕ}.
Algorithm:
1. Initialize h to the most specific hypothesis in H.
2. For each positive training instance x:
       For each attribute constraint aᵢ in h:
           If the constraint aᵢ is satisfied by x,
               then do nothing;
           else replace aᵢ in h by the next more general constraint
               that is satisfied by x.
3. Output hypothesis h.
Example:
Consider the following data set, which records which particular seeds are poisonous (each example lists four attributes together with a positive or negative outcome).
First, we initialize the hypothesis to the most specific hypothesis. Hence, our hypothesis would be:
h = {ϕ, ϕ, ϕ, ϕ}
Consider example 1:
The data in example 1 is { GREEN, HARD, NO, WRINKLED }, a positive example. Our initial hypothesis is more specific than this example, so we have to generalize it. Hence, the hypothesis becomes:
h = { GREEN, HARD, NO, WRINKLED }
Consider example 2:
Here we see that this example has a negative outcome. Hence we neglect this example and
our hypothesis remains the same.
h = { GREEN, HARD, NO, WRINKLED }
Consider example 3:
Here we see that this example has a negative outcome. Hence we neglect this example and
our hypothesis remains the same.
h = { GREEN, HARD, NO, WRINKLED }
Consider example 4:
The data present in example 4 is { ORANGE, HARD, NO, WRINKLED }, a positive example. We compare every attribute with the current hypothesis, and if any mismatch is found, we replace that particular attribute with the general case ( " ? " ). After doing this, the hypothesis becomes:
h = { ?, HARD, NO, WRINKLED }
Consider example 5:
The data present in example 5 is { GREEN, SOFT, YES, SMOOTH }, a positive example. We compare every attribute with the current hypothesis, and if any mismatch is found, we replace that particular attribute with the general case ( " ? " ). After doing this, the hypothesis becomes:
h = { ?, ?, ?, ? }
Since we have reached a point where all the attributes in our hypothesis have the general condition, example 6 and example 7 would result in the same hypothesis with all general attributes.
h = { ?, ?, ?, ? }
Hence, for the given data the final hypothesis would be:
Final Hypothesis: h = { ?, ?, ?, ? }
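Below is a minimal Python sketch of Find-S. Since the seeds table is not reproduced in full above, the sketch is run on classic EnjoySport-style data (the same attributes that appear in the candidate elimination output later in these notes); the concrete rows are an assumption for illustration.

# Find-S: use only the positive examples; start from the first positive
# example (maximally specific) and generalize an attribute to '?'
# whenever a later positive example disagrees with it.

def find_s(examples):
    positives = [x for x, label in examples if label == "yes"]
    h = list(positives[0])
    for x in positives[1:]:
        for i, value in enumerate(x):
            if h[i] != value:      # mismatch -> generalize this attribute
                h[i] = "?"
    return h

# EnjoySport-style training data (assumed rows, for illustration only).
data = [
    (("sunny", "warm", "normal", "strong", "warm", "same"), "yes"),
    (("sunny", "warm", "high", "strong", "warm", "same"), "yes"),
    (("rainy", "cold", "high", "strong", "warm", "change"), "no"),
    (("sunny", "warm", "high", "strong", "cool", "change"), "yes"),
]
print(find_s(data))  # ['sunny', 'warm', '?', 'strong', '?', '?']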
Candidate Elimination Algorithm:
The candidate elimination algorithm incrementally builds the version space, given a hypothesis space H and a set E of examples. The examples are added one by one; each example possibly shrinks the version space by removing the hypotheses that are inconsistent with it. The algorithm does this by updating the general and specific boundaries for each new example.
Terms Used:
● Concept learning: the basic learning task of the machine (learning a concept from training data).
● General hypothesis: does not constrain the features; G = {'?', '?', '?', ...}, with one '?' per attribute (any value is acceptable).
● Specific hypothesis: constrains the features to specific values; S = {'ϕ', 'ϕ', 'ϕ', ...}, with one 'ϕ' per attribute (no value is acceptable).
● Version space: the set of hypotheses lying between the general and the specific boundaries. It is not just one hypothesis but the set of all hypotheses consistent with the training data set.
Algorithm:
Step 1: Load the data set.
Step 2: Initialize the general hypothesis G and the specific hypothesis S.
Step 3: For each training example:
Step 4: If the example is positive:
            for each attribute:
                if attribute_value == hypothesis_value:
                    do nothing
                else:
                    replace the attribute value in S with '?'
                    (basically generalizing it)
Step 5: If the example is negative:
            make the general hypotheses more specific, so that they
            exclude the negative example.
Output (for the classic EnjoySport training data):
G = [['sunny', '?', '?', '?', '?', '?'], ['?', 'warm', '?', '?', '?', '?']]
S = ['sunny', 'warm', '?', 'strong', '?', '?']
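For completeness, here is a compact Python sketch of candidate elimination for conjunctive hypotheses that reproduces the boundaries above. The training rows are the same assumed EnjoySport-style data as in the Find-S sketch, and the negative-example step uses the common simplified specialization (turning a '?' in G into the corresponding value from S).

def candidate_elimination(examples):
    # Assumes the first training example is positive (true for this data).
    n = len(examples[0][0])
    S = list(examples[0][0])
    G = [["?"] * n]
    for x, label in examples:
        if label == "yes":
            # Generalize S where it disagrees with the positive example,
            # and drop G members inconsistent with the example.
            for i in range(n):
                if S[i] != x[i]:
                    S[i] = "?"
            G = [g for g in G
                 if all(g[i] in ("?", x[i]) for i in range(n))]
        else:
            # Specialize each g in G minimally so it rejects the
            # negative example, using values taken from S.
            new_G = []
            for g in G:
                for i in range(n):
                    if g[i] == "?" and S[i] != "?" and S[i] != x[i]:
                        g2 = list(g)
                        g2[i] = S[i]
                        new_G.append(g2)
            G = new_G
    return S, G

# Assumed EnjoySport-style rows, for illustration only.
data = [
    (("sunny", "warm", "normal", "strong", "warm", "same"), "yes"),
    (("sunny", "warm", "high", "strong", "warm", "same"), "yes"),
    (("rainy", "cold", "high", "strong", "warm", "change"), "no"),
    (("sunny", "warm", "high", "strong", "cool", "change"), "yes"),
]
S, G = candidate_elimination(data)
print("S =", S)  # ['sunny', 'warm', '?', 'strong', '?', '?']
print("G =", G)  # [['sunny', '?', ...], ['?', 'warm', ...]]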