Module
• 1. Data storage
• Facilities for storing and retrieving huge amounts of data are an important component of the
learning process. Humans and computers alike utilize data storage as a foundation for advanced
reasoning.
• In a human being, the data is stored in the brain and data is retrieved using electrochemical
signals.
• Computers use hard disk drives, flash memory, random access memory and similar devices
to store data, and use cables and other technology to retrieve data.
• 2. Abstraction
• The second component of the learning process is known as abstraction. Abstraction is the
process of extracting knowledge about stored data. This involves creating general concepts
about the data as a whole. The creation of knowledge involves the application of known models
and the creation of new models.
• The process of fitting a model to a dataset is known as training. When the model has been
trained, the data is transformed into an abstract form that summarizes the original
information.
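To make the idea of training concrete, here is a minimal Python sketch (assuming scikit-learn and made-up toy numbers) in which a whole dataset is abstracted into a couple of fitted parameters; the data and the choice of model are purely illustrative, not part of the original notes.

# A minimal sketch of "training": fitting a model to a (hypothetical) toy dataset
# so that the data is summarized by a small set of learned parameters.
from sklearn.linear_model import LinearRegression

# Hypothetical stored data: hours studied -> exam score
X = [[1], [2], [3], [4], [5]]        # input attribute
y = [52, 58, 63, 70, 74]             # observed outcome

model = LinearRegression()
model.fit(X, y)                      # training: fit the model to the dataset

# The abstraction: the whole dataset is now summarized by two numbers.
print("slope:", model.coef_[0], "intercept:", model.intercept_)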
• 3. Generalization
• The third component of the learning process is known as generalization. The term
generalization describes the process of turning the knowledge about stored data into a form
that can be utilized for future action. These actions are to be carried out on tasks that are
similar, but not identical, to those that have been seen before. In generalization, the goal is
to discover those properties of the data that will be most relevant to future tasks.
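As a small illustrative sketch of generalization, the following Python snippet (assuming scikit-learn and its bundled iris dataset) fits a classifier on one part of the data and measures how well the learned knowledge carries over to examples it has never seen; the particular dataset and classifier are assumptions chosen only for illustration.

# A sketch of generalization: the model is judged on data it has not seen,
# similar to (but not identical with) the training examples.
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

# Hold out unseen examples to estimate how well the learned knowledge transfers.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)                       # learn from the observed data
print("accuracy on unseen data:", accuracy_score(y_test, clf.predict(X_test)))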
https://youtu.be/voKs59e1FQ8
✔ According to Tom Mitchell, “A computer program is said to learn from experience E with
respect to some class of tasks T and performance measure P, if its performance at tasks in T,
as measured by P, improves with experience E.”
✔ Example: In Spam E-Mail detection,
✔ Task, T: To classify mails into Spam or Not Spam.
✔ Performance measure, P: The percentage of mails correctly classified as “Spam” or
“Not Spam”.
✔ Experience, E: A set of mails labelled “Spam” or “Not Spam”.
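The (T, P, E) framing can be written out in a few lines of Python. The labels and “predictions” below are made up solely to show how P would be computed as a percentage of correctly classified mails.

# A hedged sketch of Mitchell's (T, P, E) framing for spam detection.
# The labels and the classifier output below are made up for illustration.

# Experience E: mails with known labels ("spam" / "not spam")
true_labels      = ["spam", "not spam", "spam", "not spam", "not spam"]

# Task T: classify each mail; here we pretend a classifier already produced output
predicted_labels = ["spam", "not spam", "not spam", "not spam", "not spam"]

# Performance measure P: percentage of mails classified correctly
correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
P = 100.0 * correct / len(true_labels)
print(f"P = {P:.1f}% of mails classified correctly")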
Step 2. Choosing the target function: The next important step is choosing the target function.
Based on the knowledge fed to the algorithm, the machine learning system chooses a NextMove
function that describes which legal move should be made. For example, while playing chess against
an opponent, once the opponent has moved, the learning algorithm decides which of the possible
legal moves should be taken in order to succeed.
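A rough, hypothetical sketch of such a NextMove target function is given below; the board representation, the legal_moves generator, and the evaluation V are all placeholders, since the notes do not specify them.

# A rough sketch of a "target function" for the chess example. Everything here is
# hypothetical: a real program would use an actual board representation and a
# learned evaluation V; this only shows the shape of NextMove.
import random

def legal_moves(board):
    # Placeholder: a real implementation would generate the legal moves of `board`.
    return ["e2e4", "d2d4", "g1f3"]

def V(board, move):
    # Placeholder evaluation: a learned V would score the position reached by `move`.
    return random.random()

def next_move(board):
    # The target function NextMove: choose the legal move with the best score.
    return max(legal_moves(board), key=lambda m: V(board, m))

print(next_move(board="start position"))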
“Problem of searching through a predefined space of potential hypotheses for the hypothesis that
best fits the training examples”
https://youtu.be/jV0LpWF6UgY
Assume that we have collected data for some attributes/features of the day, such as Sky, Air Temperature,
Humidity, Wind, Water, and Forecast. Let this set of instances be denoted by X; many concepts can be
defined over X. For example, the concepts can be: days on which my friend Sachin enjoys his favorite
water sport; days on which my friend Sachin will not go outside of his house.
Target concept — The concept or function to be learned is called the target concept and denoted by c. It
can be seen as a boolean valued function defined over X and can be represented as c: X → {0, 1}.
For the target concept c, “Days on which my friend Sachin enjoys his favorite water sport”, an attribute
EnjoySport is included in the dataset X; it indicates whether or not my friend Sachin enjoys his
favorite water sport on that day.
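The instance space X and the target concept c can be written down directly. The four example days below follow the standard textbook (Mitchell) EnjoySport data and are included here only as an illustration of c: X → {0, 1}.

# A small sketch of the instance space X and the target concept c for EnjoySport.
# The four rows are the usual textbook examples; treat them as illustrative.

# Each instance: (Sky, AirTemp, Humidity, Wind, Water, Forecast)
X = [
    ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),
    ("Sunny", "Warm", "High",   "Strong", "Warm", "Same"),
    ("Rainy", "Cold", "High",   "Strong", "Warm", "Change"),
    ("Sunny", "Warm", "High",   "Strong", "Cool", "Change"),
]

# Target concept c: X -> {0, 1}; 1 means Sachin enjoys his water sport that day.
c = {X[0]: 1, X[1]: 1, X[2]: 0, X[3]: 1}

for x in X:
    print(x, "-> EnjoySport =", "Yes" if c[x] else "No")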
So far we have looked at what a concept, a target concept, and concept learning are, and we have
extended these definitions to the example of the days on which my friend Sachin enjoys his water
sport. We have also looked at the hypothesis representation.
With this knowledge we can say that the EnjoySport concept learning task requires learning the set
of days for which EnjoySport = Yes and then describing this set by a conjunction of constraints over
the instance attributes.
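A hypothesis expressed as a conjunction of constraints can be checked against an instance with a few lines of Python; the hypothesis h below (“any sunny, warm day”) and the helper name satisfies are illustrative assumptions, not part of the original notes.

# A sketch of the hypothesis representation: a conjunction of constraints, one per
# attribute, where "?" accepts any value and the empty constraint accepts none.
def satisfies(hypothesis, instance):
    """Return True if the instance meets every constraint in the hypothesis."""
    return all(h == "?" or h == x for h, x in zip(hypothesis, instance))

# Hypothetical hypothesis: Sachin enjoys the sport on any sunny, warm day.
h = ("Sunny", "Warm", "?", "?", "?", "?")

print(satisfies(h, ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same")))   # True
print(satisfies(h, ("Rainy", "Cold", "High",   "Strong", "Warm", "Change"))) # False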
We know that an inductive learning algorithm tries to induce a “general rule” from a set of
observed instances. So the above case is the same as inductive learning, where a learning algorithm
tries to find a hypothesis h (a general rule) in H such that h(x) = c(x) for all x in D. In reality,
for a given collection of examples, the learning algorithm returns a function h (hypothesis) that
approximates c (the target concept). The expectation, however, is that the learning algorithm
returns a function h that equals c, i.e. h(x) = c(x) for all x in D.
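One concrete algorithm that performs this search is Find-S (from Mitchell's textbook), sketched below only as an illustration of finding such an h; it generalizes a most-specific hypothesis over the positive EnjoySport examples.

# Find-S: induce a "general rule" (a maximally specific consistent hypothesis)
# from the observed positive examples.
def find_s(examples):
    """examples: list of (instance_tuple, label) with label 1 for positive."""
    positives = [x for x, label in examples if label == 1]
    # Start with the first positive example as the most specific hypothesis.
    h = list(positives[0])
    for x in positives[1:]:
        # Generalize each constraint just enough to cover the new positive example.
        h = [hi if hi == xi else "?" for hi, xi in zip(h, x)]
    return tuple(h)

D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),   1),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"),   1),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), 0),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), 1),
]
print(find_s(D))   # ('Sunny', 'Warm', '?', 'Strong', '?', '?')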
So we can state the inductive learning hypothesis: any hypothesis found to approximate the
target function well over a sufficiently large set of training examples will also approximate the
target function well over other unobserved examples.
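As a rough, illustrative check of this hypothesis, the snippet below compares the hypothesis found above against assumed target-concept values on two made-up days that were not in the training data; both the unobserved days and their labels are assumptions.

# Compare the learned hypothesis h against the (assumed) target concept c
# on instances that were not part of the training examples.
def satisfies(h, x):
    return all(hi == "?" or hi == xi for hi, xi in zip(h, x))

h = ("Sunny", "Warm", "?", "Strong", "?", "?")        # hypothesis learned by Find-S above

unobserved = [
    (("Sunny", "Warm", "Normal", "Strong", "Cool", "Same"),  1),  # assumed label
    (("Rainy", "Cold", "Normal", "Light",  "Warm", "Same"),  0),  # assumed label
]
for x, cx in unobserved:
    print(x, "-> h(x) =", int(satisfies(h, x)), ", c(x) =", cx)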