MLQB Unit 3
PART A (2 MARKS)
• Answer: Voting in ensemble learning aims to combine the predictions of multiple models to
make a final decision, often using a majority vote.
13. In bagging, how are different subsets of the training data created for each model?
[CO3,K1]
• Answer: Different subsets are created through bootstrap sampling, where instances are randomly
selected with replacement from the original training data.
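For illustration, one bootstrap replicate can be drawn with a few lines of NumPy (a minimal sketch; the array names are only illustrative):

import numpy as np

rng = np.random.default_rng(0)
X = np.arange(10)                                      # toy training set of 10 instances
idx = rng.choice(len(X), size=len(X), replace=True)    # sample indices with replacement
bootstrap_sample = X[idx]                              # some instances repeat, others are left out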
14. Provide an example scenario where boosting might be beneficial in ensemble learning.
[CO3,K2]
• Answer: Boosting is beneficial when dealing with weak learners, improving their performance
sequentially by giving more emphasis to misclassified instances.
15. How does stacking contribute to the diversity of models in an ensemble? [CO3,K1]
• Answer: Stacking leverages diverse base models by training a meta-model to combine their
predictions, capturing different perspectives and improving overall performance.
16. What is the main difference between supervised and unsupervised learning? [CO3,K1]
• Answer: In supervised learning, models are trained on labeled data with known outputs, while
unsupervised learning deals with unlabeled data, focusing on discovering patterns and structures.
17. Explain the concept of centroids in the K-means clustering algorithm. [CO3,K1]
• Answer: Centroids in K-means are the representative points for each cluster, calculated as the
mean of all data points assigned to that cluster.
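As a small worked example (toy values), the centroid is simply the component-wise mean of the cluster's points:

import numpy as np

cluster_points = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
centroid = cluster_points.mean(axis=0)                 # -> array([3., 4.])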
• Answer: The class of a new instance is determined by the majority class among its k-nearest
neighbors based on a distance metric.
19. How does GMM handle uncertainty in cluster assignments compared to K-means?
[CO3,K1]
• Answer: GMM assigns probabilities to data points belonging to each cluster, providing a more
nuanced and probabilistic view of cluster assignments compared to the hard assignments in K-
means.
Answer: EM is crucial for estimating the parameters of GMMs. It iteratively updates means,
covariances, and weights to maximize the likelihood of the data, making GMMs effective for
modeling complex distributions.
PART B [16 MARKS]
• Different Hyper-parameters: We can use the same learning algorithm but use it with different
hyper-parameters.
1. Multiexpert combination.
• Multiexpert combination methods have base-learners that work in parallel.
a) Global approach (learner fusion): given an input, all base-learners generate an output and
all these outputs are used; examples are voting and stacking.
b) Local approach (learner selection): in mixture of experts, there is a gating model, which
looks at the input and chooses one (or very few) of the learners as responsible for generating
the output.
Voting
• The simplest way to combine multiple classifiers is by voting, which corresponds to taking a
linear combination of the learners. Voting is an ensemble machine learning algorithm.
y_i = Σ_{j=1}^{L} w_j d_{ji}, where d_{ji} is the vote of learner j for class C_i and w_j is the weight of that vote,
and then we choose the class with the highest y_i.
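A minimal Python sketch of this weighted vote, assuming d[j][i] holds learner j's vote for class C_i and w holds the (normalized) learner weights; all values are illustrative:

import numpy as np

d = np.array([[0.9, 0.1],          # votes of L = 3 learners for K = 2 classes
              [0.4, 0.6],          # rows = learners, columns = classes
              [0.7, 0.3]])
w = np.array([0.5, 0.3, 0.2])      # non-negative weights that sum to 1

y = w @ d                          # y_i = sum_j w_j * d_ji for each class
predicted_class = int(np.argmax(y))    # choose the class with the highest y_i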
• One problem with ECOC (error-correcting output codes) is that because the code matrix W is set a priori, there is no
guarantee that the subtasks defined by the columns of W will be simple.
Ensemble Learning
• The idea of ensemble learning is to employ multiple learners and combine their predictions.
If we have a committee of M models with uncorrelated errors, then simply by averaging their predictions the
average error can be reduced by a factor of M (a short derivation of this claim is sketched after the list below).
1. Variance reduction: If the training sets are completely independent, it always helps to
average an ensemble, because this reduces variance without affecting bias (e.g., bagging)
and reduces sensitivity to individual data points.
2. Bias reduction: For simple models, an average of models has much greater capacity than a
single model. Averaging can therefore reduce bias substantially by increasing capacity, while
variance is controlled by fitting one component at a time (e.g., boosting).
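The factor-of-M claim above can be sketched as follows (a standard committee argument, assuming zero-mean errors that are uncorrelated across models). Write each model's prediction as y_m(x) = f(x) + ε_m(x); then, in LaTeX notation,

\[
y_{\mathrm{COM}}(x) = \frac{1}{M}\sum_{m=1}^{M} y_m(x),
\qquad
E_{\mathrm{COM}}
= \mathbb{E}\!\left[\Big(\frac{1}{M}\sum_{m=1}^{M}\epsilon_m\Big)^{2}\right]
= \frac{1}{M^{2}}\sum_{m=1}^{M}\mathbb{E}\!\left[\epsilon_m^{2}\right]
= \frac{1}{M}\,E_{\mathrm{AV}},
\]

where E_AV is the average squared error of the individual models. In practice the errors are correlated, so the reduction is smaller, but the committee error never exceeds E_AV.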
Bagging
• Bagging is also called bootstrap aggregating. Bagging and boosting are meta-algorithms
that pool decisions from multiple classifiers. Bagging creates an ensemble by repeatedly
resampling the training data at random.
Pseudocode:
1. Given training data (x1, y1), ..., (xm, ym)
2. For t = 1,..., T:
a. Form bootstrap replicate dataset St by selecting m random examples from the training set
with replacement.
b. Let ht be the result of training base learning algorithm on St.
3. Output combined classifier:
H(x) = majority(h1(x), ..., hT(x)).
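A minimal Python sketch of this pseudocode, using a decision tree as the base learning algorithm (the data set and parameter choices are only illustrative):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=0)        # toy training data
rng = np.random.default_rng(0)
T = 25
learners = []

for t in range(T):
    idx = rng.choice(len(X), size=len(X), replace=True)           # step 2a: bootstrap replicate S_t
    learners.append(DecisionTreeClassifier(random_state=t).fit(X[idx], y[idx]))  # step 2b: train h_t on S_t

votes = np.array([h.predict(X) for h in learners])                # shape (T, n_samples)
H = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)   # step 3: majority vote
print("training accuracy of the bagged ensemble:", (H == y).mean())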
Boosting
• Boosting is a very different method to generate multiple predictions (function estimates)
and combine them linearly. Boosting refers to a general and provably effective method of
producing a very accurate classifier by combining rough and moderately inaccurate rules of
thumb.
AdaBoost:
• AdaBoost, short for "Adaptive Boosting", is a machine learning meta-algorithm formulated
by Yoav Freund and Robert Schapire, who won the prestigious Gödel Prize in 2003 for their
work. It can be used in conjunction with many other types of learning algorithms to improve
their performance.
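As a brief illustration, AdaBoost is available in scikit-learn; a minimal usage sketch (parameter values are only for demonstration):

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# The default weak learner is a depth-1 decision tree (a "decision stump");
# each boosting round re-weights the training data to emphasize misclassified points.
ada = AdaBoostClassifier(n_estimators=50, random_state=0)
ada.fit(X_tr, y_tr)
print("test accuracy:", ada.score(X_te, y_te))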
Stacking
• In stacking, the outputs of the base-learners are used as inputs to a second-level combiner (meta-learner), which is itself trained to produce the final prediction.
3. Use clustering in real-world scenarios for tasks like customer segmentation, image
segmentation, or document categorization. [CO3,K3]
The output from a clustering algorithm is basically a statistical description of the cluster
centroids with the number of components in each cluster.
• Cluster centroid: The centroid of a cluster is a point whose parameter values are the mean
of the parameter values of all the points in the cluster. Each cluster has a well-defined centroid.
• Distance: The distance between two points is taken as a common metric to assess the
similarity among the components of the population. The most commonly used distance measure is the
Euclidean metric, which defines the distance between two points x = (x1, ..., xn) and y = (y1, ..., yn) as
d(x, y) = sqrt(Σ_i (xi − yi)²).
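For example, with toy values:

import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 6.0, 3.0])
d = np.sqrt(np.sum((x - y) ** 2))      # Euclidean distance; here d = 5.0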
• Strengths:
1. Efficient in computation.
2. Easy to implement.
• Weaknesses:
1. The number of clusters k must be specified in advance.
2. The result is sensitive to the initial choice of centroids and to noisy data and outliers.
3. It is applicable only when a mean is defined and is not suited to clusters with non-convex shapes.
K-Nearest Neighbour (K-NN) is one of the simplest machine learning algorithms, based on the
supervised learning approach.
• The K-NN algorithm assumes similarity between the new case/data and the available cases,
and puts the new case into the category that is most similar to the available categories.
• The K-NN algorithm stores all the available data and classifies a new data point based on
similarity. This means that when new data appears, it can be easily classified into a
well-suited category using the K-NN algorithm.
• The K-NN algorithm can be used for regression as well as for classification, but it is
mostly used for classification problems.
Why Do We Need KNN?
• Suppose there are two categories, Category A and Category B, and we have a new data
point x1, which must belong to one of these categories. To solve this type of problem, we need
a K-NN algorithm. With the help of K-NN, we can easily identify the category or class of a
particular data point.
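A minimal illustration with scikit-learn's KNeighborsClassifier, where synthetic two-dimensional data stands in for Category A and Category B (all values are illustrative):

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (20, 2)),    # Category A clustered near (0, 0)
               rng.normal(3, 0.5, (20, 2))])   # Category B clustered near (3, 3)
y = np.array([0] * 20 + [1] * 20)              # 0 = Category A, 1 = Category B

knn = KNeighborsClassifier(n_neighbors=5)      # k = 5 nearest neighbours
knn.fit(X, y)

x1 = np.array([[2.5, 2.7]])                    # the new data point
print("predicted category:", knn.predict(x1)[0])   # majority class among its 5 neighbours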
5. Analyze the role of parameters in GMM, including the means, covariances, and
weights of the individual Gaussian components. [CO3,K4]
Gaussian Mixture Models (GMM) is a "soft" clustering algorithm, where each point probabilistically
"belongs" to all clusters. This is different from k-means, where each point belongs to exactly one
cluster.
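A short sketch of this soft assignment with scikit-learn's GaussianMixture (the two-cluster data is synthetic and purely illustrative):

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(4, 1, (100, 2))])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
print(gmm.predict_proba(X[:3]))    # soft assignments: each row gives per-cluster probabilities summing to 1
print(gmm.means_, gmm.weights_)    # fitted component means and mixture weights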
Expectation-maximization
• In the expectation (E) step, the current Gaussian parameters are used to compute, for each data point,
the probability (responsibility) that it belongs to each component of the mixture. In the maximization (M)
step, the means, covariances, and mixture weights of the components are re-estimated using these responsibilities.
• The goal of the algorithm is to find the parameter vector ϕ that maximizes the likelihood of
the observed values of X, L(ϕ | X).
• But in cases where this is not feasible, we associate the extra hidden variables Z and express the
underlying model using both, to maximize the likelihood of the joint distribution of X and Z, the
complete likelihood Lc(ϕ | X, Z).
• EM alternates between performing an expectation (E) step, which computes an expectation
of the likelihood by including the latent variables as if they were observed, and a maximization (M)
step, which computes the maximum likelihood estimates of the parameters by maximizing the
expected likelihood found in the E step.
• Expectation-Maximization (EM) is a technique used in point estimation. Given a set of
observable variables X and unknown (latent) variables Z, we want to estimate the parameters θ in a
model.
• EM is useful for several reasons: conceptual simplicity, ease of implementation, and the fact that
each iteration improves L(ϕ). The rate of convergence in the first few steps is typically quite good,
but can become excruciatingly slow as you approach a local optimum.
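To make the E and M steps concrete, here is a minimal one-dimensional, two-component EM sketch in Python (initial values and data are arbitrary; this is an illustration, not a full implementation):

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 150), rng.normal(3, 1, 150)])   # observed data X

mu, sigma, pi = np.array([-1.0, 1.0]), np.array([1.0, 1.0]), np.array([0.5, 0.5])  # initial guesses

for _ in range(50):
    # E step: responsibilities = posterior probability of each component for each point
    dens = np.stack([pi[k] * norm.pdf(x, mu[k], sigma[k]) for k in range(2)])
    resp = dens / dens.sum(axis=0)                     # shape (2, n); columns sum to 1

    # M step: re-estimate parameters to maximize the expected likelihood from the E step
    Nk = resp.sum(axis=1)
    mu = (resp * x).sum(axis=1) / Nk
    sigma = np.sqrt((resp * (x - mu[:, None]) ** 2).sum(axis=1) / Nk)
    pi = Nk / len(x)

print(mu, sigma, pi)   # should approach means near (-2, 3), std devs near 1, weights near 0.5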
7. Explain the concept of Ensemble Learning, highlighting the model combination schemes.
Discuss the advantages and limitations of Voting as a model combination scheme.
[CO3,K2]
Advantages of Voting:
• Simple to implement.
• Effective when combining diverse models.
Limitations of Voting:
• All learners are weighted equally unless the weights are carefully tuned, so a few strong models can be outvoted by many weak ones.
• It gives little benefit when the base models are highly correlated or make similar errors.
8.Compare and contrast the Bagging and Boosting ensemble techniques. Provide insights
into scenarios where each technique excels. [CO3,K4]
Answer: Bagging (Bootstrap Aggregating) and Boosting are both ensemble techniques.
Bagging: trains each base learner independently (in parallel) on a bootstrap sample of the training
data and combines them by voting or averaging; its main effect is variance reduction.
Boosting: trains base learners sequentially, re-weighting the training data so that later learners
concentrate on the instances misclassified earlier, and combines them by a weighted vote; its main
effect is bias reduction.
Scenarios: Bagging excels with high-variance, low-bias learners (e.g., deep decision trees) and is
relatively robust to noisy data; Boosting excels when only weak learners (e.g., decision stumps) are
available and the data is not too noisy, since noisy or mislabeled points receive ever-larger weights.
9.Explore the principles behind Stacking in ensemble learning. Discuss the process of
model stacking and the advantages it offers. [CO3,K4]
Answer: Stacking involves training multiple base models, and a meta-model is trained to combine
their predictions. The process includes:
• Training several diverse base learners on the training data.
• Generating their predictions (typically on held-out or cross-validated data) to form a new feature set.
• Training a meta-model (combiner) on these predictions to produce the final output.
Advantages of Stacking:
• It can combine heterogeneous models and learn how much to trust each one, often outperforming any single base model or simple voting.
• The meta-model can correct systematic errors made by the base learners.
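As an illustration, scikit-learn's StackingClassifier implements this process; the particular base learners and meta-learner below are chosen only for demonstration:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("svm", SVC(probability=True, random_state=0))],
    final_estimator=LogisticRegression(),   # meta-model trained on cross-validated base predictions
)
stack.fit(X_tr, y_tr)
print("test accuracy:", stack.score(X_te, y_te))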
10.Discuss the K-means clustering algorithm in unsupervised learning. Explain the steps
involved, including initialization and convergence. Highlight potential challenges in using
K-means. [CO3,K6]
Answer: K-means partitions the data into k clusters by (1) initializing k centroids (e.g., randomly
or with k-means++), (2) assigning each point to its nearest centroid, (3) recomputing each centroid
as the mean of its assigned points, and (4) repeating steps 2-3 until the assignments stop changing
(convergence).
Challenges:
• The number of clusters k must be chosen in advance.
• Results depend on the initial centroids; the algorithm may converge to a poor local optimum, so multiple restarts are common.
• It is sensitive to outliers and assumes roughly spherical clusters of similar size.
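A short illustration with scikit-learn's KMeans, showing the main controls for initialization and convergence (the data and parameter values are only for demonstration):

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)),
               rng.normal(5, 1, (100, 2)),
               rng.normal((0, 5), 1, (100, 2))])

km = KMeans(
    n_clusters=3,        # k must be chosen in advance
    init="k-means++",    # initialization strategy
    n_init=10,           # several restarts to mitigate a bad initial choice of centroids
    max_iter=300,        # repeat assign/update steps until convergence or this limit
    random_state=0,
).fit(X)

print(km.cluster_centers_)   # final centroids
print(km.inertia_)           # within-cluster sum of squares at convergence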