A Sampling of Various Other Learning Methods
Decision Tree Induction
An example decision tree to solve the problem of how to spend
my free time (play soccer or go to the movies?)
(Figure: example decision tree, not shown; the root node tests Outlook)
(Figures, not shown: the tree expressed as a disjunction of conjunctive rules; recovered fragment: "(Outlook=overcast) or")
Notes:
o The attribute that "best classifies" the examples can be determined on the basis of maximizing homogeneity of outcome in the resulting subgroups, cross-validated accuracy, best fit of some linear regressor, etc. (see the sketch after these notes)
o DTI is best suited for problems where:
  - the domain is discrete
  - the target function has discrete outputs
  - disjunctive/conjunctive descriptions are required
  - the training data may be noisy
  - the training data may have missing values
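One common way to instantiate the homogeneity criterion above is information gain (my choice here for illustration; the slides do not commit to a particular measure). The sketch below scores a candidate split by how much it reduces the entropy of the outcome; the soccer/movies data are made up.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of discrete outcome labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(labels, groups):
    """Entropy reduction obtained by splitting `labels` into `groups`
    (a list of label sublists, one per branch of the candidate split)."""
    n = len(labels)
    weighted = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - weighted

# Toy example: splitting on Outlook for the soccer/movies decision
# (data invented purely for illustration).
labels = ["soccer", "soccer", "movies", "soccer", "movies"]
groups = [["soccer", "soccer"],   # Outlook = sunny
          ["soccer"],             # Outlook = overcast
          ["movies", "movies"]]   # Outlook = rain
print(information_gain(labels, groups))  # higher gain = more homogeneous subgroups
```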
Decision Tree Induction
Notes (CONT'D):
o DTI can represent any finite discrete-valued function
o Extensions for continuous variables do exist
o The search is typically greedy and thus can be trapped in local minima
o DTI is very sensitive to high feature-to-sample ratios; when many features each contribute a little to classification, DTI does not do well
o DT models are highly intuitive and easy to explain and use, even when no computing equipment is available
Supplementary Readings
S.K. Murthy, "Automatic Construction of Decision Trees from Data: A Multi-Disciplinary Survey," Data Mining and Knowledge Discovery, 1997.
Genetic Algorithms
Evolutionary Computation (Genetic Algorithms & Genetic Programming) is motivated by the success of evolution as a robust method for adaptation found in nature.
The standard/prototypical genetic algorithm is simple:
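A minimal sketch of the standard GA loop (fitness-proportionate selection, single-point crossover, bit-flip mutation); the parameter defaults and the one-max fitness function below are assumptions chosen only for illustration, not part of the slides.

```python
import random

def genetic_algorithm(fitness, n_bits=6, pop_size=20, generations=50,
                      p_crossover=0.8, p_mutation=0.01):
    """Prototypical GA over fixed-length bitstrings (illustrative parameters)."""
    pop = ["".join(random.choice("01") for _ in range(n_bits)) for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(h) for h in pop]
        # Fitness-proportionate (roulette-wheel) selection of the parents.
        parents = random.choices(pop, weights=scores, k=pop_size)
        next_pop = []
        for h1, h2 in zip(parents[::2], parents[1::2]):
            if random.random() < p_crossover:          # single-point crossover
                point = random.randrange(1, n_bits)
                h1, h2 = h1[:point] + h2[point:], h2[:point] + h1[point:]
            next_pop += [h1, h2]
        # Bit-flip mutation applied independently to every bit.
        pop = ["".join(b if random.random() >= p_mutation else "10"[int(b)]
                       for b in h) for h in next_pop]
    return max(pop, key=fitness)

# Toy fitness: number of 1-bits ("one-max"), purely for demonstration.
print(genetic_algorithm(lambda h: h.count("1") + 1))
```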
Genetic Algorithms
Representation of hypotheses in GAs is typically a bitstring, so that the mutation and crossover operations can be applied easily.
E.g., consider encoding clinical decision-making rules:
variable1: fever {yes, no}
variable2: x_ray {positive, negative}
variable3: diagnosis {flu, pneumonia}
Rule1: fever=yes and x_ray=positive => diagnosis=pneumonia
Rule2: fever=no and x_ray=negative => diagnosis=flu or pneumonia
Bitstring representation:
R1: 10 10 01
R2: 01 01 11
(note: we can constrain this representation by using fewer bits, the fitness function, and syntactic checks)
Genetic Algorithms
Let's cross over these rules at a (randomly chosen) point, here after the first two bits:
R1: 10 10 01
R2: 01 01 11
Gives:
R1': 10 01 11
R2': 01 10 01
Mutating a randomly chosen bit (here the fourth bit of R1') then gives:
R1'': 10 00 11
R2'': 01 10 01
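A minimal sketch of the two operators applied to the encoded rules above; the crossover point (after the first two bits) and the mutated position (the fourth bit of R1') are fixed here only to reproduce the result shown, whereas a real GA would choose them at random.

```python
def single_point_crossover(h1, h2, point):
    """Swap the tails of two bitstrings after `point`."""
    return h1[:point] + h2[point:], h2[:point] + h1[point:]

def mutate(h, position):
    """Flip the bit at `position`."""
    flipped = "1" if h[position] == "0" else "0"
    return h[:position] + flipped + h[position + 1:]

r1, r2 = "101001", "010111"                # R1, R2 from the slide
r1p, r2p = single_point_crossover(r1, r2, 2)
print(r1p, r2p)                            # 100111 011001  -> R1', R2'
r1pp = mutate(r1p, 3)                      # flip the fourth bit of R1'
print(r1pp, r2p)                           # 100011 011001  -> R1'', R2''
```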
Notes:
• The population size, crossover rate, and mutation rate are parameters that are set empirically.
• There exist variations in how to do crossover, how to select hypotheses for mutation/crossover, how to isolate subpopulations, etc.
• Although it may appear at first that the process of finding better hypotheses relies totally on chance, this is not the case. Several theoretical results (the most famous being the "Schema Theorem") prove that exponentially more better-fit hypotheses are considered than worse-fit ones (with respect to the number of generations).
• Furthermore, due to the discrete nature of the optimization, local minima trap the algorithm less easily, but it also becomes more difficult to find the global optimum.
• It has been shown that GAs perform an implicit parallel search over hypothesis templates without explicitly generating them ("implicit parallelism").
Genetic Algorithms
Notes:
• GAs are "black box" optimizers (i.e., applied without any special knowledge about the problem structure); sometimes they are appropriately applied to learn models when no better alternative can reasonably be found, and when they do have a chance of finding a good solution.
• There exist cases, however, when much faster and provably sound algorithms can (and should) be used, as well as cases where uninformed heuristic search is provably incapable of finding a good solution or of scaling up to large problem inputs (and thus should not be used).
In addition:
– The No Free Lunch Theorem (NFLT) for optimization states that no black-box optimizer is better than any other when averaged over all possible distributions and objective functions.
– There are broad classes of problems for which problem-solving with GAs is NP-hard.
– There are types of target functions that GAs cannot learn effectively (e.g., "Royal Road" functions as well as highly epistatic functions).
– The choice of parameters is critical in producing a solution, yet finding the right parameters is NP-hard in many cases.
– Due to the extensive evaluation of hypotheses, it is easy to overfit.
– The "biological" metaphor is conceptually useful but not crucial; there have been equivalent formulations of GAs that do not use concepts such as "mutation", "crossover", etc.
K-Nearest Neighbors
Assume we wish to model patient response to treatment; suppose we have
seen the following cases:
Similarity between the new case i and each training case is measured by Euclidean distance (ED), e.g.: ED = √((1−1)² + (1−2)²) = 1
K-Nearest Neighbors
As we can see, the training case most similar to i has outcome 2. The 2 training cases most similar to i have a median outcome of 2. The 3 training cases most similar to i have a median outcome of 2, and so on. We say that for K=1 the KNN-predicted value is 2, for K=2 the predicted value is 2, and so on.
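A minimal KNN sketch consistent with the description above, predicting by the median outcome of the K nearest training cases under Euclidean distance; the (feature vector, outcome) pairs below are hypothetical placeholders, since the original table of cases is not shown.

```python
from math import dist            # Euclidean distance (Python 3.8+)
from statistics import median

def knn_predict(train, new_case, k):
    """Predict the outcome of `new_case` as the median outcome of its
    k nearest training cases (Euclidean distance in feature space)."""
    neighbors = sorted(train, key=lambda case: dist(case[0], new_case))
    return median(outcome for _, outcome in neighbors[:k])

# Hypothetical (feature vector, outcome) pairs standing in for the slide's table.
train = [((1, 2), 2), ((3, 4), 1), ((1, 1), 2), ((5, 5), 3)]
new_case = (1, 1)
for k in (1, 2, 3):
    print(k, knn_predict(train, new_case, k))   # predicts 2 for K=1, 2, 3
```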
K-Nearest Neighbors
To summarize:
Clustering
An unsupervised class of methods.
Basic idea: group similar items together and different items apart.
Countless variations:
o of what constitutes "similarity" (may be distance in feature space, may be other measures of association)
o of what will be clustered (patients, features, time series, cell lines, combinations thereof, etc.)
o of whether clusters are "hard" (no multi-membership) or "fuzzy"
o of how clusters will be built and organized (partitional, agglomerative, non-hierarchical methods)
Uses:
o Taxonomy (e.g., identify molecular subtypes of disease)
o Classification (e.g., classify patients according to genomic information)
o Hypothesis generation (e.g., if genes are highly "co-expressed", this may suggest they are in the same pathway)
Clustering
K-means clustering: we want to partition the data into k most-similar groups.
Variations:
- Selection of good initial partitions
- Allowing splitting/merging of the resulting clusters
- Various similarity measures and convergence criteria
Clustering (k-means)
E.g., K=2, with six one-dimensional points:
A=2, B=3, C=9, D=10, E=11, F=12
Step 1 (arbitrary initial partition):
[A B C D] [E F]
centroid1 = 6, centroid2 = 11.5
Step 2 (each point reassigned to the nearest centroid):
[A B] [C D E F]
centroid1 = 2.5, centroid2 = 10.5
-------(algorithm stops: no point changes cluster)--------
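A sketch of the k-means iteration on the six points above, starting from the same arbitrary initial partition; it reproduces the two steps shown (note centroid2 = 10.5 after Step 2) and stops once no point changes cluster.

```python
def kmeans_1d(points, clusters):
    """Iterate assignment/update steps until the partition stops changing."""
    while True:
        centroids = [sum(c) / len(c) for c in clusters]
        # Reassign every point to the cluster with the nearest centroid.
        new_clusters = [[] for _ in centroids]
        for x in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))
            new_clusters[nearest].append(x)
        if new_clusters == clusters:        # no point moved: converged
            return clusters, centroids
        clusters = new_clusters

points = [2, 3, 9, 10, 11, 12]              # A..F
initial = [[2, 3, 9, 10], [11, 12]]         # Step 1: [A B C D] [E F]
print(kmeans_1d(points, initial))           # ([[2, 3], [9, 10, 11, 12]], [2.5, 10.5])
```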
Clustering
Agglomerative Single Link:
Note:
- The inter-cluster distance between clusters A and B is computed as the minimum distance over all pattern pairs (a, b) s.t. a belongs to A and b to B
Clustering (ASL)
E.g., with points:
A=1, B=2, C=5, D=7
Step 1: [A] [B] [C] [D]   smallest inter-cluster distance: [A]-[B] = 1
Step 2: [A B] [C] [D]     smallest inter-cluster distance: [C]-[D] = 2
Step 3: [A B] [C D]       smallest inter-cluster distance: [A B]-[C D] = 3
Step 4: [A B C D]
-------(algorithm stops: all points are in one cluster)--------
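A sketch of the agglomerative procedure on the points above, with the linkage passed in as a function: min gives single link (this example), and swapping in max gives complete link, as used on the ACL slides below.

```python
def agglomerative(points, linkage, target_clusters=1):
    """Repeatedly merge the two clusters with the smallest linkage distance."""
    clusters = [[p] for p in points]        # start with singletons
    while len(clusters) > target_clusters:
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        # Linkage distance between two clusters: min (single link) or max
        # (complete link) over all pairwise point distances.
        i, j = min(pairs, key=lambda ij: linkage(abs(a - b)
                   for a in clusters[ij[0]] for b in clusters[ij[1]]))
        print("merge", clusters[i], clusters[j])
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Single link on A=1, B=2, C=5, D=7 reproduces the merges above.
agglomerative([1, 2, 5, 7], linkage=min)
# Complete link: agglomerative([1, 2, 5, 7], linkage=max)
```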
Clustering (ASL)
E.g., with points:
A=1, B=2, C=5, D=7, E=11, F=12
Step 1: [A] [B] [C] [D] [E] [F]   smallest inter-cluster distance: [A]-[B] = 1 OR [E]-[F] = 1
(Figure: the corresponding single-link dendrogram over A-F, not shown)
Clustering
Agglomerative Complete Link:
Note:
- The inter-cluster distance between clusters A and B is computed as the maximum distance over all pattern pairs (a, b) s.t. a belongs to A and b to B
Clustering (ACL)
E.g., with points:
A=1, B=2, C=5, D=7, E=11, F=12
Step 1: [A] [B] [C] [D] [E] [F]   smallest inter-cluster distance: [A]-[B] = 1 OR [E]-[F] = 1
(Figure: the corresponding complete-link dendrogram over A-F, not shown)
Clustering
Caveats:
a. There is no good understanding of how to translate from "A and B cluster together" to "A and B are dependent/independent, causally/non-causally".
b. There exist very few studies outlining what can or cannot be learned with clustering methods (learnability), how reliably (validity, stability), and with what sample (sample complexity). Such analyses exist for a variety of other methods. The few existing theoretical results point to significant limitations of clustering methods.
c. Other comments: visual appeal, familiarity, small sample, no explicit assumptions to check, accessibility, tractability.