
NCVRT - AI & ML
MACHINE LEARNING
UNIT-1
▪ Learning Problems – Perspectives and Issues – Concept Learning – Version Spaces and Candidate Elimination – Inductive Bias – Decision Tree Learning – Representation – Algorithm – Heuristic Space Search.
WHAT IS MACHINE LEARNING?
▪ Machine learning (ML) is a subfield of artificial intelligence (AI) that focuses on enabling
computer systems to learn from data without being explicitly programmed.
▪ Subset of AI: Machine learning is a specific approach to achieving artificial intelligence.
While AI is a broader concept encompassing any technique that allows computers to mimic
human intelligence, ML focuses on learning from data.
▪ Learning from Data: Instead of being given explicit instructions for every task, ML
algorithms are trained on large datasets. They identify patterns, relationships, and insights
within this data.
▪ Without Explicit Programming: The core idea is that the machine learns how to perform a
task by analyzing data, rather than a programmer writing specific code for every possible
scenario.
▪ Improvement Through Experience: As ML models are exposed to more data, their
performance on the given task typically improves.
1. LEARNING PROBLEMS – PERSPECTIVES AND ISSUES
▪ What is a Learning Problem?
▪ A learning problem arises when we want a computer system to improve its performance on a
specific task based on experience (data).
▪ Task (T): The specific problem we want to solve (e.g., classifying emails as spam or not spam,
predicting house prices).
▪ Performance (P): A metric that quantifies how well the system is performing the task (e.g., accuracy, precision, recall, mean squared error).
▪ Experience (E): The data that the system learns from (e.g., a collection of labeled emails,
historical house prices).
▪ The goal of a learning algorithm is to use the experience (E) to improve the system's
performance (P) on the task (T).
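
▪ To make T, P, and E concrete, here is a minimal sketch in Python (assuming scikit-learn is installed; the synthetic dataset and the choice of logistic regression are illustrative, not prescribed by this unit):

```python
# Minimal sketch of the T/P/E framing, assuming scikit-learn is available.
# Task (T): classify synthetic two-class points; Experience (E): labeled
# samples; Performance (P): accuracy on held-out data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)  # E
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)  # learn from E
print("P (accuracy on T):", accuracy_score(y_test, model.predict(X_test)))
```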
2. PERSPECTIVES ON LEARNING PROBLEMS
▪ 1.Supervised Learning: Learning from labeled data (input-output pairs). The goal is to learn a mapping
function that can predict the output for new, unseen inputs.
▪ Classification: Predicting a discrete output label (e.g., cat/dog, spam/not spam).
▪ Regression: Predicting a continuous output value (e.g., house price, temperature).

▪ 2.Unsupervised Learning: Learning from unlabeled data. The goal is to discover hidden patterns, structures, or
relationships in the data.
▪ Clustering: Grouping similar data points together.
▪ Dimensionality Reduction: Reducing the number of features while preserving important information.
▪ Association Rule Mining: Finding relationships between different items in a dataset.

▪ 3.Reinforcement Learning: Learning through interaction with an environment. An agent learns to take actions
that maximize a reward signal.
▪ 4.Semi-Supervised Learning: Learning from a combination of labeled and unlabeled data.
▪ 5. Active Learning: The learning algorithm strategically queries a user or oracle to label the most informative data points.
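
▪ The sketch below contrasts the first two perspectives on the same data, assuming scikit-learn; the dataset and model choices (KNeighborsClassifier, KMeans) are illustrative:

```python
# Supervised vs. unsupervised learning on the same points, assuming
# scikit-learn is available. Data is synthetic.
from sklearn.datasets import make_blobs
from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans

X, y = make_blobs(n_samples=300, centers=3, random_state=0)

# Supervised: the labels y are part of the experience.
clf = KNeighborsClassifier().fit(X, y)
print("predicted label:", clf.predict(X[:1]))

# Unsupervised: only X is given; cluster structure is discovered.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("cluster assignments:", km.labels_[:10])
```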
KEY ISSUES IN MACHINE LEARNING
▪ 1. Data Acquisition and Preparation: Obtaining sufficient, relevant, and high-quality data is
crucial. This involves data cleaning, preprocessing, feature engineering, and handling missing
values.
▪ 2. Choosing the Right Representation: How the data is represented (features) significantly
impacts the learning process and the model's performance.
▪ 3. Selecting the Appropriate Algorithm: Different algorithms have different strengths and weaknesses and are suited for different types of tasks and data.
▪ 4. Model Complexity and Generalization: Balancing model complexity to avoid overfitting (performing well on training data but poorly on unseen data) and underfitting (failing to capture the underlying patterns).
▪ 5. Bias and Fairness: Ensuring that the learning process and the resulting models are fair and do not perpetuate or amplify existing biases in the data.
▪ 6. Interpretability and Explainability: Understanding why a model makes certain
predictions, especially important in critical applications.
▪ 7. Scalability: Handling large datasets and complex models efficiently.
▪ 8. Evaluation and Validation: Assessing the performance of the learned model on unseen
data to ensure generalization.
▪ 9. Computational Resources: The time and computational power required for training and
deploying models.
▪ 10. Data Privacy and Security: Protecting sensitive data used for training and prediction.
▪ 11. Concept Drift: Dealing with changes in the underlying data distribution over time.
3. CONCEPT LEARNING
▪ What is a Concept?
▪ In machine learning, a concept is a boolean-valued function defined over a set of instances. It represents a category or a set
of items that belong together. The goal of concept learning is to learn this boolean function from a set of positive and
negative examples.
▪ Instance Space (X): The set of all possible objects or examples. Each instance is typically represented by a set of features.
▪ Target Concept (c): The boolean function we want to learn, where c(x)=1 if instance x belongs to the concept and c(x)=0
otherwise.
▪ Training Examples (D): A set of labeled instances, where each instance x is paired with its correct label c(x).
▪ Positive Examples: Instances for which c(x)=1.
▪ Negative Examples: Instances for which c(x)=0.
▪ Hypothesis Space (H): The set of all possible hypotheses (candidate concept definitions) that the learning algorithm can
consider.
▪ Learner’s Task: To find a hypothesis h in H such that h(x)=c(x) for all instances x in the instance space X.
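
▪ A minimal sketch of this notation in Python, using conjunctive hypotheses where "?" accepts any attribute value (the attribute values shown are made up for illustration):

```python
# A concept as a boolean function over attribute tuples. "?" is a
# wildcard constraint; a specific value must match exactly.

def h_of_x(hypothesis, instance):
    """Return True (h(x)=1) iff every constraint accepts the instance."""
    return all(c == "?" or c == v for c, v in zip(hypothesis, instance))

x = ("Sunny", "Warm", "Normal")   # an instance from the instance space X
h = ("Sunny", "?", "?")           # one hypothesis from H
print(h_of_x(h, x))               # True -> x is a positive example under h
```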
4. VERSION SPACES AND CANDIDATE ELIMINATION
▪ Version Space:
▪ The version space, with respect to a hypothesis space H and a set of training examples D, is the subset of hypotheses from H that are consistent with all training examples in D.
▪ In other words, it's the set of all plausible hypotheses that could be the target concept given the
observed data.
▪ Candidate Elimination Algorithm:
▪ The candidate elimination algorithm is a method for finding the version space. It maintains two sets of
hypotheses:
▪ G (General Boundary): The set of maximally general hypotheses in H that are consistent with all the training examples in D (they cover every positive example and exclude every negative example).
▪ S (Specific Boundary): The set of maximally specific hypotheses in H that are consistent with all the training examples in D.
ALGORITHM STEPS
▪ 1. Initialization:
▪ Initialize S to contain the most specific hypothesis (e.g., for conjunctive hypotheses, this could be a hypothesis
that matches no instances).
▪ Initialize G to contain the most general hypothesis (e.g., for conjunctive hypotheses, this could be a hypothesis
that matches all instances).
▪ 2. Processing Positive Examples: For each positive training example x:
▪ Remove any hypothesis in G that is inconsistent with x (i.e., that fails to cover the positive example).
▪ For each hypothesis s in S that is inconsistent with x:
▪ Remove s from S.
▪ Generalize s to the minimal more general hypotheses that are consistent with x.
▪ Add each of these new generalized hypotheses to S if and only if some hypothesis in G is more general than or equal to it.
▪ Remove any hypothesis in S that is more general than another hypothesis in S.
▪ 3. Processing Negative Examples: For each negative training example x:
▪ Remove any hypothesis in S that covers x (i.e., that is inconsistent with the negative example).
▪ For each hypothesis g in G that covers x:
▪ Remove g from G.
▪ Specialize g to the minimal more specific hypotheses that do not cover x.
▪ Add each of these new specialized hypotheses to G if and only if it is more general than or equal to some hypothesis in S.
▪ Remove any hypothesis in G that is more specific than another hypothesis in G.
▪ 4. Termination: The algorithm terminates when S and G converge to a single hypothesis (if
the target concept is learnable and uniquely identifiable within H) or when they define the
boundaries of the version space.
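
▪ The sketch below is a compact, illustrative implementation of these S/G updates for conjunctive hypotheses; the attribute domains and the two training examples are invented, and some boundary pruning steps (e.g., removing redundant S members) are simplified away:

```python
# Compact, illustrative candidate elimination for conjunctive hypotheses.
# "?" accepts any value; "0" is the empty (match-nothing) constraint.
# Attribute domains and the two training examples below are invented.

def covers(h, x):
    """True if hypothesis h classifies instance x as positive."""
    return all(c == "?" or c == v for c, v in zip(h, x))

def generalize(s, x):
    """Minimal generalization of s that covers positive example x."""
    return tuple(v if c == "0" else (c if c == v else "?")
                 for c, v in zip(s, x))

def specializations(g, x, domains):
    """Minimal specializations of g that exclude negative example x."""
    return [g[:i] + (val,) + g[i + 1:]
            for i, c in enumerate(g) if c == "?"
            for val in domains[i] if val != x[i]]

domains = [("Sunny", "Rainy"), ("Warm", "Cold"), ("Normal", "High")]
S = [("0", "0", "0")]      # most specific boundary
G = [("?", "?", "?")]      # most general boundary

examples = [(("Sunny", "Warm", "Normal"), True),
            (("Rainy", "Cold", "High"), False)]

for x, positive in examples:
    if positive:
        G = [g for g in G if covers(g, x)]
        S = [generalize(s, x) if not covers(s, x) else s for s in S]
    else:
        S = [s for s in S if not covers(s, x)]
        new_G = []
        for g in G:
            if not covers(g, x):
                new_G.append(g)
            else:  # keep only specializations above some member of S
                new_G.extend(h for h in specializations(g, x, domains)
                             if any(covers(h, s) for s in S))
        G = new_G

print("S boundary:", S)   # [('Sunny', 'Warm', 'Normal')]
print("G boundary:", G)   # three maximally general consistent hypotheses
```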
5. INDUCTIVE BIAS
▪ What is Inductive Bias?
▪ Inductive bias (also known as learning bias) refers to the set of assumptions that a learning
algorithm makes to generalize from the training data to unseen instances.
▪ It's the preference of the learning algorithm for one hypothesis over another, even if both are
consistent with the observed training data.
▪ Why is Inductive Bias Necessary?
▪ Without any inductive bias, a learning algorithm would have no basis for choosing one
generalization over another when faced with unseen data.
▪ For any set of training examples, there could be infinitely many hypotheses that are consistent
with them but make different predictions on new instances. Inductive bias allows the algorithm
to make reasonable generalizations.
TYPES OF INDUCTIVE BIAS
▪ Hypothesis Space Restriction: The algorithm only considers hypotheses from a specific, limited set. This is a strong form of bias. For example, a linear regression model assumes a linear relationship between features and the target variable.
▪ Preference for Certain Hypotheses: Even within a given hypothesis space, the algorithm might prefer some hypotheses over others (e.g., simpler hypotheses over more complex ones).
▪ Search Bias: The way the learning algorithm searches through the hypothesis space can introduce bias. For example, a greedy search might find a locally optimal solution that is not globally optimal.
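
▪ A small illustration of inductive bias using NumPy: two polynomial models fit the same made-up training points, yet extrapolate very differently; preferring the linear model (a restricted hypothesis space) is an inductive bias:

```python
# Two models fit the same training data but disagree on an unseen input.
# Choosing between them requires an assumption, i.e., an inductive bias.
import numpy as np

x_train = np.array([0.0, 1.0, 2.0, 3.0])
y_train = np.array([0.1, 0.9, 2.1, 2.9])   # roughly linear, made-up data

linear = np.poly1d(np.polyfit(x_train, y_train, deg=1))  # strong bias
cubic = np.poly1d(np.polyfit(x_train, y_train, deg=3))   # weaker bias

x_new = 6.0  # unseen input, outside the training range
print("linear model predicts:", linear(x_new))   # about 5.8
print("cubic model predicts: ", cubic(x_new))    # about -5.1
# Both fit the training points closely, but their extrapolations diverge.
```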
6. DECISION TREE LEARNING
▪ In machine learning, a decision tree is a supervised learning algorithm used for both
classification and regression, represented as a flowchart-like structure that makes predictions
by following a series of decision rules.
▪ Decision trees learn from labeled data, meaning they are trained on data where the correct
outcome (or target variable) is known.
▪ Decision rules:
▪ These are the conditions or questions that determine the path taken through the tree.
▪ Training Data: The data used to train the decision tree algorithm, containing features and the
corresponding target variable.
▪ Prediction: Once trained, the decision tree can predict outcomes for new, unseen data by
following the decision rules.
HOW A DECISION TREE WORKS
▪ 1. Data Input: The algorithm is fed with a dataset containing features and the target variable.
▪ 2. Feature Selection: The algorithm selects the most important features to use for splitting the
data.
▪ 3. Splitting: The data is split into subsets based on the selected features and their values.
▪ 4. Recursion: The process of splitting and selecting features is repeated recursively until a
stopping condition is met (e.g., all data in a branch is of the same class, or a maximum tree
depth is reached).
▪ 5. Prediction: When a new data point is presented, it is passed down the tree based on the decision rules, eventually reaching a leaf node, which represents the prediction.
7. DECISION TREE LEARNING – REPRESENTATION
▪ Representation: A decision tree is a tree-like structure where:
▪ Each internal node represents a test on an attribute (feature).
▪ Each branch represents the outcome of the test.
▪ Each leaf node represents a class label (for classification) or a predicted value (for
regression).
▪ To classify a new instance, we start at the root node, perform the test on the attribute, follow
the corresponding branch, and continue this process until we reach a leaf node, which
provides the classification or prediction.
▪ Root Node: The starting point of the tree, representing the entire dataset.
▪ Internal Nodes: Represent decision rules or questions based on features of the data.
▪ Branches: Represent possible outcomes or paths based on the decision rules.
▪ Leaf Nodes: Represent the final predictions or classifications.
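
▪ One possible in-memory representation of this structure, sketched in Python (the Node layout and the attribute values are illustrative, not a standard API):

```python
# Bare-bones sketch of the representation: internal nodes test an
# attribute, branches are outcomes, leaves hold class labels.

class Node:
    def __init__(self, attribute=None, branches=None, label=None):
        self.attribute = attribute      # index of the attribute to test
        self.branches = branches or {}  # outcome value -> child Node
        self.label = label              # class label if this is a leaf

def classify(node, instance):
    """Walk from the root to a leaf following the decision rules."""
    while node.label is None:
        node = node.branches[instance[node.attribute]]
    return node.label

# Tiny hand-built tree: test attribute 0, then predict at a leaf.
root = Node(attribute=0, branches={
    "Sunny": Node(label="Play"),
    "Rainy": Node(label="Don't Play"),
})
print(classify(root, ("Sunny", "Warm")))   # -> "Play"
```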
8. ALGORITHM (A BASIC GREEDY APPROACH, E.G., ID3, C4.5)
▪ 1. Start with all training examples at the root node.
▪ 2. If all examples at the current node belong to the same class, then the current node becomes a leaf node labeled with that class.
▪ 3. If there are no remaining attributes to test, then the current node becomes a leaf node labeled with the most common class among the examples at that node (majority voting).
▪ 4. Otherwise, select the "best" attribute to split the current node. The "best" attribute is typically chosen based on a splitting criterion that aims to maximize the separation of classes in the resulting child nodes.
▪ 5. Create child nodes for each distinct value of the selected attribute.
▪ 6. Distribute the training examples at the current node to the child nodes based on the value of the selected attribute.
▪ 7. Recursively apply steps 2-6 to each child node.
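
▪ An illustrative ID3-style sketch of these steps, using information gain as the splitting criterion (the tiny dataset at the bottom is made up):

```python
# ID3-style sketch of steps 1-7. Rows are tuples of categorical attribute
# values; attrs are attribute indices.
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    gain = entropy(labels)
    for value in set(r[attr] for r in rows):
        subset = [lab for r, lab in zip(rows, labels) if r[attr] == value]
        gain -= (len(subset) / len(labels)) * entropy(subset)
    return gain

def id3(rows, labels, attrs):
    if len(set(labels)) == 1:                 # step 2: pure node -> leaf
        return labels[0]
    if not attrs:                             # step 3: majority vote
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))  # step 4
    tree = {best: {}}
    for value in set(r[best] for r in rows):  # steps 5-6: one child per value
        idx = [i for i, r in enumerate(rows) if r[best] == value]
        tree[best][value] = id3([rows[i] for i in idx],
                                [labels[i] for i in idx],
                                [a for a in attrs if a != best])  # step 7
    return tree

rows = [("Sunny", "Hot"), ("Sunny", "Cool"), ("Rainy", "Cool")]
labels = ["No", "Yes", "Yes"]
print(id3(rows, labels, attrs=[0, 1]))   # splits on attribute 1 here
```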
9. HEURISTIC SPACE SEARCH
▪ The process of learning a decision tree can be viewed as a heuristic search through the space of
possible decision trees.
▪ Search Space: The set of all possible decision trees that can be constructed from the given
attributes. This space is typically very large.
▪ Heuristic: The splitting criterion (e.g., information gain, gain ratio, Gini impurity) acts as a
heuristic function that guides the search towards more promising decision trees. The goal of
the heuristic is to find a tree that accurately classifies the training data and generalizes well to
unseen data.
▪ Greedy Search: Most decision tree learning algorithms employ a greedy top-down approach.
At each step, they select the locally optimal attribute to split on without backtracking or
considering alternative splits made earlier in the tree construction process. This greedy nature
means that the algorithm might not find the globally optimal decision tree.
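
▪ The two most common heuristics can be computed directly from the class counts at a node; the sketch below shows entropy and Gini impurity, where lower impurity in the child nodes indicates a more promising split:

```python
# Split heuristics from class proportions p_i:
#   entropy = -sum(p_i * log2 p_i),  Gini impurity = 1 - sum(p_i ** 2).
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

mixed = ["spam", "spam", "ham", "ham"]
pure = ["spam", "spam", "spam", "spam"]
print(entropy(mixed), gini(mixed))   # 1.0, 0.5 (maximally impure, 2 classes)
print(entropy(pure), gini(pure))     # 0, 0 (pure node)
```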
ISSUES AND CONSIDERATIONS IN DECISION TREE LEARNING
▪ Overfitting: Decision trees can easily overfit the training data, especially if they grow very deep (see the sketch after this list).
▪ Handling Continuous Attributes: Continuous attributes need to be discretized or handled
using split points.
▪ Handling Missing Values: Strategies are needed to deal with instances where some attribute
values are missing.
▪ Computational Complexity: Building a decision tree can be computationally expensive,
especially with a large number of attributes and training examples.
▪ Bias: Decision trees built with information gain are biased toward attributes with many distinct values, and they can be sensitive to the order in which attributes are considered.
▪ Representation Power: Decision trees can represent complex decision boundaries, but they might struggle with certain types of functions (e.g., XOR).
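
▪ A hedged sketch of the overfitting issue, assuming scikit-learn: an unconstrained tree tends to memorize noisy training data, while limiting depth (a simple form of pre-pruning) often generalizes better; the exact numbers will vary:

```python
# Comparing an unconstrained tree with a depth-limited one on noisy,
# synthetic data (flip_y injects label noise).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=20, flip_y=0.1,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for depth in (None, 3):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_tr, y_tr)
    print(f"max_depth={depth}: train={tree.score(X_tr, y_tr):.2f}, "
          f"test={tree.score(X_te, y_te):.2f}")
```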
