Find-S Algorithm

The Find-S algorithm is a concept learning algorithm that finds the most specific hypothesis that fits all the positive training examples. It initializes the hypothesis to the most specific representation, ignores negative examples, and, for each positive example, replaces every attribute value in the hypothesis that does not match the example with '?'. The final hypothesis after processing all examples is the most specific one that still fits every positive example.
Find-S Algorithm

The Find-S algorithm is a basic concept learning algorithm in machine learning. It finds the most specific hypothesis that fits all the positive examples. Note that the algorithm considers only the positive training examples.

Introduction :
The Find-S algorithm starts with the most specific hypothesis and generalizes it each time it fails to cover an observed positive training example. Hence, the Find-S algorithm moves from the most specific hypothesis toward more general hypotheses, generalizing only as far as the positive examples require.
Important Representation :

1. ? indicates that any value is acceptable for the attribute.
2. A single value (e.g., Cold) specifies that only that value is acceptable for the attribute.
3. ϕ indicates that no value is acceptable.
4. The most general hypothesis is represented by: {?, ?, ?, ?, ?, ?}
5. The most specific hypothesis is represented by: {ϕ, ϕ, ϕ, ϕ, ϕ, ϕ}
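As a rough sketch, the representation above could be encoded in Python as follows. The names ANY, PHI and satisfies are assumptions made for this illustration, as is the choice of None to stand for ϕ; the attribute values in the sample instance are invented.

```python
# Assumed encoding (not from the source): "?" accepts any value,
# None stands in for ϕ, and any other string is a single required value.
ANY = "?"
PHI = None


def satisfies(hypothesis, instance):
    """Return True if every constraint in the hypothesis accepts the instance."""
    return all(
        c is not PHI and (c == ANY or c == v)
        for c, v in zip(hypothesis, instance)
    )


most_general = [ANY] * 6
most_specific = [PHI] * 6

# Illustrative instance; the attribute values are invented for this sketch.
instance = ["Sunny", "Cold", "High", "Strong", "Cool", "Same"]

print(satisfies(most_general, instance))   # True: "?" accepts everything
print(satisfies(most_specific, instance))  # False: ϕ accepts nothing
print(satisfies(["?", "Cold", "?", "?", "?", "?"], instance))  # True
print(satisfies(["?", "Warm", "?", "?", "?", "?"], instance))  # False
```

The most general hypothesis accepts every instance and the most specific one accepts none, which is why Find-S can start from the latter and only ever loosen constraints.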
Steps Involved In Find-S :

1. Start with the most specific hypothesis.
   h = {ϕ, ϕ, ϕ, ϕ, ϕ, ϕ}
2. Take the next example; if it is negative, make no changes to the hypothesis.
3. If the example is positive and the current hypothesis is too specific to cover it, generalize the hypothesis just enough to cover the example.
4. Keep repeating the above steps until all the training examples have been processed.
5. After all the training examples have been processed, we have the final hypothesis, which we can use to classify new examples.
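The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name find_s, the (values, is_positive) pair format, and the use of None for ϕ are all assumptions, and the toy data is invented.

```python
PHI = None  # assumed stand-in for ϕ
ANY = "?"


def find_s(examples):
    """examples: list of (attribute_values, is_positive) pairs."""
    if not examples:
        return []
    h = [PHI] * len(examples[0][0])      # step 1: most specific hypothesis
    for values, is_positive in examples:
        if not is_positive:              # step 2: negatives change nothing
            continue
        for i, v in enumerate(values):   # step 3: generalize where needed
            if h[i] is PHI:
                h[i] = v                 # first positive example is copied
            elif h[i] != ANY and h[i] != v:
                h[i] = ANY               # mismatch becomes "?"
    return h                             # step 5: final hypothesis


# Invented toy data with two attributes.
toy = [
    (["Red", "Round"], True),
    (["Blue", "Square"], False),
    (["Red", "Oval"], True),
]
print(find_s(toy))  # ['Red', '?']
```

With no positive examples at all, the sketch returns the all-ϕ hypothesis unchanged, which classifies everything as negative.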
Example :
Consider the following data set having the data about
which particular seeds are poisonous.

First, we initialize the hypothesis to the most specific hypothesis. Since each example has four attributes, our hypothesis would be :
h = {ϕ, ϕ, ϕ, ϕ}

Consider example 1 :
The data in example 1 is { GREEN, HARD, NO, WRINKLED }. We see that our initial hypothesis is too specific to cover this positive example, so we have to generalize it. Hence, the hypothesis becomes :
h = { GREEN, HARD, NO, WRINKLED }
Consider example 2 :
Here we see that this example has a negative outcome.
Hence we neglect this example and our hypothesis
remains the same.
h = { GREEN, HARD, NO, WRINKLED }
Consider example 3 :
Here we see that this example has a negative outcome.
Hence we neglect this example and our hypothesis
remains the same.
h = { GREEN, HARD, NO, WRINKLED }
Consider example 4 :
The data present in example 4 is { ORANGE, HARD, NO, WRINKLED }. We compare every single attribute with the current hypothesis, and if any mismatch is found we replace that particular attribute with the general case ( "?" ). After doing the process the hypothesis becomes :
h = { ?, HARD, NO, WRINKLED }
Consider example 5 :
The data present in example 5 is { GREEN, SOFT, YES, SMOOTH }. We compare every single attribute with the current hypothesis, and if any mismatch is found we replace that particular attribute with the general case ( "?" ). After doing the process the hypothesis becomes :
h = { ?, ?, ?, ? }
Since we have reached a point where all the attributes in our hypothesis have the general condition, example 6 and example 7 would result in the same hypothesis with all general attributes.
h = { ?, ?, ?, ? }
Hence, for the given data the final hypothesis would be :
Final Hypothesis: h = { ?, ?, ?, ? }
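The walkthrough above can be replayed with a short Python trace. This is only a sketch: it uses the three positive examples whose values are quoted in the text (examples 1, 4 and 5), since the full data table is not reproduced here, and the helper name generalize and the use of None for ϕ are assumptions.

```python
PHI, ANY = None, "?"  # assumed stand-ins for ϕ and the "any value" marker


def generalize(h, x):
    """Minimal generalization of hypothesis h that covers positive example x."""
    return [v if c is PHI else (c if c in (ANY, v) else ANY)
            for c, v in zip(h, x)]


positives = [
    ["GREEN", "HARD", "NO", "WRINKLED"],   # example 1
    ["ORANGE", "HARD", "NO", "WRINKLED"],  # example 4
    ["GREEN", "SOFT", "YES", "SMOOTH"],    # example 5
]

h = [PHI] * 4
trace = []
for x in positives:
    h = generalize(h, x)
    trace.append(h)
    print(h)
# ['GREEN', 'HARD', 'NO', 'WRINKLED']
# ['?', 'HARD', 'NO', 'WRINKLED']
# ['?', '?', '?', '?']
```

Once every position holds "?", no further positive example can change the hypothesis, which matches the remark about examples 6 and 7.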

Algorithm :

1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x
     For each attribute constraint a_i in h
       If the constraint a_i is satisfied by x
         Then do nothing
       Else replace a_i in h by the next more general constraint that is satisfied by x
3. Output hypothesis h
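For the attribute-vector hypotheses used in this document, "the next more general constraint that is satisfied by x" has only three possible levels: ϕ, a single value, and ?. A small Python sketch of that inner step is shown below; the function name next_more_general and the use of None for ϕ are assumptions made for illustration.

```python
PHI = None  # assumed stand-in for ϕ
ANY = "?"


def next_more_general(constraint, value):
    """Smallest generalization of one constraint that accepts `value`."""
    if constraint is PHI:
        return value       # ϕ is generalized to the observed value
    if constraint == ANY or constraint == value:
        return constraint  # constraint already satisfied: do nothing
    return ANY             # a conflicting value is generalized to "?"


print(next_more_general(PHI, "Cold"))     # Cold
print(next_more_general("Cold", "Cold"))  # Cold
print(next_more_general("Cold", "Warm"))  # ?
print(next_more_general(ANY, "Warm"))     # ?
```

Because each constraint can only climb this three-step chain, a hypothesis never becomes more specific again once it has been generalized.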

How To Implement Find-S Algorithm In Machine Learning?

In Machine Learning, concept learning can be termed as “a problem of searching through a predefined space of potential hypotheses for the hypothesis that best fits the training examples” (Tom Mitchell). In this article, we will go through one such concept learning algorithm, known as the Find-S algorithm. The following topics are discussed in this article.
• What is Find-S Algorithm in Machine Learning?
• How Does it Work?
• Limitations of Find-S Algorithm
• Implementation of Find-S Algorithm
• Use Case

What is Find-S Algorithm in Machine Learning?

In order to understand the Find-S algorithm, you need to have a basic idea of the following concepts as well:

1. Concept Learning
2. General Hypothesis
3. Specific Hypothesis

1. Concept Learning

Let’s try to understand concept learning with a real-life example. Most of human learning is based on past instances or experiences. For example, we are able to identify any type of vehicle based on a certain set of features like make, model, etc., that are defined over a large set of features.

These special features differentiate the set of cars, trucks, etc. from the larger set of vehicles. These features that define the set of cars, trucks, etc. are known as concepts.

Similar to this, machines can also learn from concepts to identify whether an object belongs to a specific category or not. Any algorithm that supports concept learning requires the following:

• Training Data
• Target Concept
• Actual Data Objects

2. General Hypothesis

A hypothesis, in general, is an explanation for something. The general hypothesis basically states the general relationship between the major variables. For example, a general hypothesis for ordering food would be “I want a burger.”

G = {‘?’, ‘?’, ‘?’, …, ‘?’}

3. Specific Hypothesis

The specific hypothesis fills in all the important details about the variables given in the general hypothesis. A more specific version of the example given above would be “I want a cheeseburger with a chicken pepperoni filling with a lot of lettuce.”

S = {‘Φ’, ‘Φ’, ‘Φ’, …, ‘Φ’}

Now, let’s talk about the Find-S Algorithm in Machine Learning.

The Find-S algorithm follows the steps written below:

1. Initialize ‘h’ to the most specific hypothesis.
2. The Find-S algorithm only considers the positive examples and ignores the negative examples. For each positive example, the algorithm checks each attribute in the example. If the attribute value is the same as the hypothesis value, the algorithm moves on without any changes. But if the attribute value is different from the hypothesis value, the algorithm changes it to ‘?’.

Now that we are done with the basic explanation of the Find-S algorithm, let us take a look at how it works.

How Does It Work?


1. The process starts by initializing ‘h’ with the most specific hypothesis. In practice, this is usually the first positive example in the data set.
2. We go through the examples one by one. If an example is negative, we move on to the next example, but if it is positive we consider it for the next step.
3. We check whether each attribute in the example is equal to the corresponding hypothesis value.
4. If the value matches, then no changes are made.
5. If the value does not match, the hypothesis value is changed to ‘?’.
6. We do this until we reach the last positive example in the data set.

Limitations of Find-S Algorithm

There are a few limitations of the Find-S algorithm, listed below:

1. There is no way to determine whether the final hypothesis is consistent with all of the training data, because the negative examples are never checked.
2. Inconsistent training sets can actually mislead the Find-S algorithm, since it ignores the negative examples.
3. The Find-S algorithm does not provide a backtracking technique to determine the best possible changes that could be made to improve the resulting hypothesis.
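The first limitation can be made concrete with a small check that Find-S itself never performs: testing the final hypothesis against the negative examples. The sketch below is illustrative only; the helper names covers and covered_negatives and the toy data are assumptions.

```python
ANY = "?"


def covers(hypothesis, instance):
    """True if the hypothesis classifies the instance as positive."""
    return all(c == ANY or c == v for c, v in zip(hypothesis, instance))


def covered_negatives(hypothesis, examples):
    """Return the negative examples that the hypothesis wrongly accepts."""
    return [x for x, is_positive in examples
            if not is_positive and covers(hypothesis, x)]


# Invented toy data: (attribute values, is_positive)
examples = [
    (["Red", "Round"], True),
    (["Blue", "Square"], False),
    (["Blue", "Round"], True),
]

h = ["?", "Round"]  # what Find-S would return for the two positives above
print(covered_negatives(h, examples))           # []
print(covered_negatives(["?", "?"], examples))  # [['Blue', 'Square']]
```

An empty list means the hypothesis is consistent with the negatives. A hypothesis made up entirely of ‘?’ accepts every instance, so it is inconsistent with any data set that contains a negative example.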

Implementation of Find-S Algorithm


To understand the implementation, let us try to apply it to a smaller data set with a handful of examples to decide if a person wants to go for a walk.

The concept of this particular problem is on what days a person likes to go for a walk.

Time     Weather  Temperature  Company  Humidity  Wind    Goes
Morning  Sunny    Warm         Yes      Mild      Strong  Yes
Evening  Rainy    Cold         No       Mild      Normal  No
Morning  Sunny    Moderate     Yes      Normal    Normal  Yes
Evening  Sunny    Cold         Yes      High      Strong  Yes

Looking at the data set, we have six attributes and a final attribute that marks each example as positive or negative. In this case, yes is a positive example, which means the person will go for a walk.
So now, the initial hypothesis, taken from the first positive example, is:

h0 = {‘Morning’, ‘Sunny’, ‘Warm’, ‘Yes’, ‘Mild’, ‘Strong’}

This is our initial, most specific hypothesis, and now we will consider each example one by one, but only the positive examples.

h1 = {‘Morning’, ‘Sunny’, ‘?’, ‘Yes’, ‘?’, ‘?’}

h2 = {‘?’, ‘Sunny’, ‘?’, ‘Yes’, ‘?’, ‘?’}

We replaced every value that differed between a positive example and the current hypothesis with ‘?’ to get the resultant hypothesis. Now that we know how the Find-S algorithm works, let us take a look at an implementation using Python.
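Before moving on, the h0, h1, h2 progression can be reproduced with plain Python lists. This is a minimal sketch that hard-codes the four rows of the table; the variable names rows, positives and history are assumptions made for illustration.

```python
# Rows of the walk data set: (attribute values, value of the "Goes" column)
rows = [
    (["Morning", "Sunny", "Warm", "Yes", "Mild", "Strong"], "Yes"),
    (["Evening", "Rainy", "Cold", "No", "Mild", "Normal"], "No"),
    (["Morning", "Sunny", "Moderate", "Yes", "Normal", "Normal"], "Yes"),
    (["Evening", "Sunny", "Cold", "Yes", "High", "Strong"], "Yes"),
]

# Only the positive examples are used.
positives = [x for x, label in rows if label == "Yes"]

h = list(positives[0])   # h0: the first positive example
history = [list(h)]
for x in positives[1:]:
    # keep matching values, replace mismatches with "?"
    h = [c if c == v else "?" for c, v in zip(h, x)]
    history.append(list(h))

for step, hyp in enumerate(history):
    print(f"h{step} = {hyp}")
# h0 = ['Morning', 'Sunny', 'Warm', 'Yes', 'Mild', 'Strong']
# h1 = ['Morning', 'Sunny', '?', 'Yes', '?', '?']
# h2 = ['?', 'Sunny', '?', 'Yes', '?', '?']
```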

Use Case
Let’s try to implement the above example using Python.
The code to implement the Find-S algorithm using the
above data is given below.

import pandas as pd
import numpy as np

# read the data from the CSV file
data = pd.read_csv("data.csv")
print(data, "\n")

# make an array of all the attributes (every column except the last)
d = np.array(data)[:, :-1]
print("\n The attributes are: ", d)

# separate the target column that marks positive and negative examples
target = np.array(data)[:, -1]
print("\n The target is: ", target)

# training function to implement the Find-S algorithm
def train(c, t):
    # start from the first positive example as the most specific hypothesis
    specific_hypothesis = None
    for i, val in enumerate(t):
        if val == "Yes":
            specific_hypothesis = c[i].copy()
            break

    # without any positive example there is nothing to generalize from
    if specific_hypothesis is None:
        return None

    # generalize with every positive example; negative examples are skipped
    for i, val in enumerate(c):
        if t[i] == "Yes":
            for x in range(len(specific_hypothesis)):
                # replace every mismatching attribute value with '?'
                if val[x] != specific_hypothesis[x]:
                    specific_hypothesis[x] = '?'

    return specific_hypothesis

# obtain the final hypothesis
print("\n The final hypothesis is:", train(d, target))
Output: