Aam Ut-1 Qb Ans- [Final]
(2 Marks Questions):
Types of Scaling:
Medicine: With the help of this algorithm, disease trends and the risk of disease can be identified.
Land Use: We can identify the areas of similar land use by this algorithm.
There is no particular way to determine the best value of "K", so we need to try
several values and pick the one that works best.
A very low value of K, such as K=1 or K=2, can be noisy and make the model
sensitive to outliers.
Larger values of K give smoother results, but a very large K can include points
from other classes and blur the class boundaries.
Elbow Method:
Test different values of K and choose the one where the error rate stabilizes
or decreases marginally.
Cross-validation:
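A minimal sketch of the two ideas above, assuming scikit-learn and using the iris dataset only as a stand-in: the cross-validated error rate is computed for a range of K values, and the K after which the curve flattens (the elbow) is a reasonable choice.
```python
# Sketch: choose K for K-NN by computing the cross-validated error rate for
# each candidate K and looking for the point where the curve flattens (elbow).
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

for k in range(1, 16):
    knn = KNeighborsClassifier(n_neighbors=k)
    accuracy = cross_val_score(knn, X, y, cv=5).mean()
    print(f"K={k:2d}  error rate={1 - accuracy:.3f}")
# Pick the K after which the error rate stabilises or decreases only marginally.
```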
Advantages:
Disadvantages:
2. Difficult to choose the correct kernel function: Selecting the right kernel
(e.g., linear, polynomial, RBF) and tuning the associated parameters (like C
and gamma) requires expertise and can be challenging.
3. Not suitable for noisy data: SVM may not perform well when the data
contains a lot of noise or overlapping classes.
Linear SVM: Linear SVM is used for linearly separable data, which means that if a
dataset can be classified into two classes using a single straight line, the data is
termed linearly separable, and the classifier used is called the Linear SVM
classifier.
Non-linear SVM: Non-linear SVM is used for non-linearly separable data, which
means that if a dataset cannot be classified using a straight line, the data is
termed non-linear data, and the classifier used is called the Non-linear SVM
classifier.
Root Node: Root node is from where the decision tree starts. It represents the
entire dataset, which further gets divided into two or more homogeneous sets.
Leaf Node: Leaf nodes are the final output nodes; the tree cannot be segregated
further after reaching a leaf node.
Splitting: Splitting is the process of dividing the decision node/root node into sub-
nodes according to the given conditions.
Pruning: Pruning is the process of removing the unwanted branches from the tree.
Parent/Child node: A node that is divided into sub-nodes is called the parent node
of those sub-nodes, and the sub-nodes are called its child nodes.
Q.7) State any TWO advantages of KNN algorithm
Handles noisy data well: It is robust to errors or outliers in the training data
(especially for larger values of K).
Effective with large datasets: It works well when there is plenty of training data,
although prediction can become slow on very large datasets.
Handles many features: It works with datasets that have many different
characteristics or variables.
(4 Marks Questions):
Feature engineering involves selecting relevant information from raw data and
transforming it into a format that can be easily understood by a model.
The goal is to improve model accuracy by providing more meaningful and relevant
information. The process of feature engineering is as given below:
Feature Selection: Choose the most important features using methods like
correlation, mutual information, or Recursive Feature Elimination (RFE).
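A small sketch of feature selection with RFE, one of the methods named above, assuming scikit-learn; the breast-cancer dataset, the decision-tree estimator, and the number of features to keep are stand-ins chosen only for illustration.
```python
# Recursive Feature Elimination (RFE): repeatedly fit a model and drop the
# least important features until the desired number remains.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

selector = RFE(estimator=DecisionTreeClassifier(random_state=0),
               n_features_to_select=10)
X_selected = selector.fit_transform(X, y)

print("original features:", X.shape[1])
print("selected features:", X_selected.shape[1])
print("mask of kept features:", selector.support_)
```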
Step-1: Begin the tree with the root node, say S, which contains the complete
dataset.
Step-2: Find the best attribute in the dataset using an Attribute Selection Measure
(ASM).
Step-3: Divide S into subsets that contain the possible values of the best
attribute.
Step-4: Generate the decision tree node that contains the best attribute.
Step-5: Recursively make new decision trees using the subsets of the dataset
created in Step-3. Continue this process until a stage is reached where the nodes
cannot be classified further; these final nodes are called leaf nodes.
ASM is a technique used for selecting the best attribute for discriminating among
tuples.
It ranks each attribute, and the best attribute is selected as the splitting criterion.
1. Information Gain: Information gain measures the reduction in entropy after
the dataset is split on an attribute. According to the value of information gain,
we split the node and build the decision tree.
2. Gini Index: The Gini Index measures the impurity of a node; the aim is to
decrease impurity from the root node (at the top of the decision tree) to the
leaf nodes of the decision tree model.
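A short sketch of how these two measures can be computed for a node from its class counts; the counts used below (a 9/5 node split into 6/2 and 3/3 children) are illustrative assumptions, not values from the answer.
```python
import math

def entropy(counts):
    # Entropy of a node given the number of samples in each class.
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def gini(counts):
    # Gini impurity of a node given the number of samples in each class.
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

def information_gain(parent_counts, children_counts):
    # Entropy of the parent minus the weighted entropy of the child nodes.
    total = sum(parent_counts)
    weighted = sum(sum(c) / total * entropy(c) for c in children_counts)
    return entropy(parent_counts) - weighted

print("entropy:", round(entropy([9, 5]), 3))    # ~0.940 for a 9/5 node
print("gini   :", round(gini([9, 5]), 3))       # ~0.459 for a 9/5 node
# Gain from splitting the 9/5 node into children with counts [6, 2] and [3, 3]:
print("info gain:", round(information_gain([9, 5], [[6, 2], [3, 3]]), 3))
```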
Example:
Suppose there is a candidate who has a job offer and wants to decide whether he
should accept the offer or not. To solve this problem, the decision tree starts
with the root node (the Salary attribute, as selected by ASM).
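A minimal sketch of the steps above using scikit-learn's DecisionTreeClassifier, which performs the attribute selection and recursive splitting internally; the iris dataset and the depth limit are assumptions made only for illustration.
```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X, y = iris.data, iris.target

# criterion="entropy" selects attributes by information gain; "gini" uses the Gini index.
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
tree.fit(X, y)

# Print the learned splits: root node at the top, leaf nodes at the bottom.
print(export_text(tree, feature_names=list(iris.feature_names)))
```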
Q.3) With suitable example, explain how Naïve Bayes Theorem is applied
Below is a training data set of weather and corresponding target variable ‘Play’
(suggesting possibilities of playing).
Now, we need to classify whether the players will play or not, based on the
weather conditions.
Problem: Players will play if the weather is sunny. Is this statement correct?
We can solve it using the above-discussed method of posterior Probability.
Dataset: [ S– Sunny O– Overcast R– Rainy (N– No Y – Yes) ]
Step 2: Interpretation
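Since the table itself is not reproduced above, the following sketch assumes the counts of the commonly used 14-row weather/"Play" dataset; the actual counts should be read from the given table. It shows how the posterior P(Yes | Sunny) is obtained from Bayes' theorem and interpreted.
```python
# Worked posterior for P(Play = Yes | Outlook = Sunny) using Bayes' theorem.
# NOTE: the counts below are assumed from the widely used 14-row weather/"Play"
# dataset; substitute the counts from the actual table in the question.
p_sunny_given_yes = 3 / 9    # Sunny days among the 9 "Yes" rows (assumed)
p_yes = 9 / 14               # prior probability of playing (assumed)
p_sunny = 5 / 14             # overall probability of a Sunny day (assumed)

p_yes_given_sunny = p_sunny_given_yes * p_yes / p_sunny
print(f"P(Yes | Sunny) = {p_yes_given_sunny:.2f}")
# With these assumed counts the posterior is 0.60 > 0.5, so the statement that
# players will play when the weather is sunny looks plausible.
```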
2. Boosting:
Models are trained sequentially, with each new model correcting the
errors of the previous one.
Boosting gives higher weights to misclassified instances to improve
performance.
Example: AdaBoost, Gradient Boosting, XGBoost.
3. Stacking:
Uses multiple base models and combines their outputs using a meta-
learner (a higher-level model).
The meta-learner learns how to best combine the base models’
predictions.
Example: Combining Decision Trees, SVM, and Neural Networks.
Advantages:
Increases model accuracy and reduces overfitting.
Works well with both classification and regression tasks.
Reduces variance and improves model robustness.
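A compact sketch of the boosting and stacking ideas described above, assuming scikit-learn; the breast-cancer dataset, the base estimators, and the parameter values are illustrative choices, not part of the original answer.
```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Boosting: models are trained sequentially, each focusing on previous errors.
boost = AdaBoostClassifier(n_estimators=100, random_state=0)

# Stacking: base models whose outputs are combined by a meta-learner.
stack = StackingClassifier(
    estimators=[("tree", DecisionTreeClassifier(max_depth=3)),
                ("svm", SVC())],
    final_estimator=LogisticRegression(max_iter=1000),
)

for name, model in [("AdaBoost", boost), ("Stacking", stack)]:
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean CV accuracy = {score:.3f}")
```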
Q.5) Consider following training dataset of weather, apply Naive Bayes
Below is a training data set of weather and corresponding target variable ‘Play’
(suggesting possibilities of playing).
Now, we need to classify whether the players will play or not, based on the
weather conditions.
Problem: Players will play if the weather is sunny. Is this statement correct?
We can solve it using the above-discussed method of posterior Probability.
Dataset: [ S– Sunny O– Overcast R– Rainy (N– No Y – Yes) ]
Step 2: Interpretation
Feature selection:
Feature selection is a process that chooses a subset of features from the original
features so that the feature space is optimally reduced according to a certain
criterion.
The goal is to reduce the dimensionality of the dataset while retaining the most
important features.
There are several methods for feature selection, including:
Filter Methods
Wrapper Methods
Embedded Methods.
Filter Methods:
• These methods are generally used while doing the pre-processing step.
• These methods select features from the dataset irrespective of the use of any
machine learning algorithm
• In terms of computation, they are very fast and inexpensive, and they are very
good at removing duplicated, correlated, or redundant features.
• Each feature is evaluated individually, which can help when features are
independent of one another (do not depend on other features), but falls short
when a combination of features would increase the overall performance of the
model.
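A minimal sketch of a filter method, assuming scikit-learn: each feature is scored independently of any learning algorithm (here with mutual information) and the top-scoring ones are kept; the dataset and the value of k are stand-ins for illustration.
```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

X, y = load_breast_cancer(return_X_y=True)

# Score every feature on its own and keep the 10 with the highest scores.
selector = SelectKBest(score_func=mutual_info_classif, k=10)
X_filtered = selector.fit_transform(X, y)

print("kept", X_filtered.shape[1], "of", X.shape[1], "features")
```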
Wrapper Methods:
They evaluate candidate subsets of features by training the chosen learning
algorithm on each subset and measuring its performance.
They are computationally expensive but often provide better results than
filter methods.
Since they rely on the actual learning algorithm, they are slower but tend to be
more accurate.
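A short sketch of a wrapper method, assuming scikit-learn's SequentialFeatureSelector: candidate feature subsets are evaluated by actually training the chosen model on them (forward selection here); the dataset, the K-NN model, and the subset size are illustrative assumptions.
```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)

# Forward selection: add one feature at a time, keeping the feature whose
# addition gives the best cross-validated score of the wrapped model.
sfs = SequentialFeatureSelector(KNeighborsClassifier(),
                                n_features_to_select=5,
                                direction="forward", cv=3)
X_wrapped = sfs.fit_transform(X, y)
print("kept", X_wrapped.shape[1], "of", X.shape[1], "features")
```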
Embedded methods
• In embedded methods, the feature selection algorithm is blended as part of
the learning algorithm, thus having its own built-in feature selection
methods.
• Embedded methods counter the drawbacks of filter and wrapper methods and
merge their advantages.
• These methods are fast like filter methods, more accurate than filter methods,
and also take combinations of features into consideration.
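A small sketch of an embedded method, assuming scikit-learn: L1-regularised logistic regression performs feature selection while it learns, by shrinking some coefficients to exactly zero, and SelectFromModel keeps the remaining features; the dataset and the values of C and max_iter are stand-ins.
```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# The L1 penalty drives unhelpful coefficients to zero during training,
# so the selection happens inside the learning algorithm itself.
l1_model = LogisticRegression(penalty="l1", solver="liblinear",
                              C=0.1, max_iter=5000)
selector = SelectFromModel(l1_model)
X_embedded = selector.fit_transform(X, y)
print("kept", X_embedded.shape[1], "of", X.shape[1], "features")
```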
Q.8) Explain Random Forest Algorithm In Detail.
Random forest is a supervised learning technique.
It predicts output with high accuracy, and it runs efficiently even on large
datasets.
3. It enhances the accuracy of the model and prevents the overfitting issue.
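A brief sketch of Random Forest in scikit-learn, illustrating the points above: many decision trees are trained on bootstrap samples and their votes are combined. The dataset, the number of trees, and the train/test split are assumptions made for illustration.
```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# 100 trees, each trained on a bootstrap sample; predictions are majority votes.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
print("test accuracy:", round(forest.score(X_test, y_test), 3))
```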
K-NN algorithm assumes the similarity between the new case/data and the available
cases and puts the new case into the category that is most similar to the available
categories.
K-NN algorithm stores all the available data and classifies a new data point based
on similarity.
This means that when new data appears, it can be easily classified into a well-suited
category by using the K-NN algorithm.
K-NN algorithm can be used for regression as well as classification, but mostly
it is used for classification problems.
It is also called a lazy learner algorithm because it does not learn from the
training set immediately; instead, it stores the dataset and, at the time of
classification, performs an action on the dataset.
Working of K-NN
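A minimal sketch of the K-NN workflow, assuming scikit-learn and using the iris dataset as a stand-in: the model only stores the training data at fit time (lazy learning) and classifies a new point by the majority class among its K nearest neighbours.
```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

knn = KNeighborsClassifier(n_neighbors=5)   # K = 5 neighbours, Euclidean distance
knn.fit(X_train, y_train)                   # "lazy": only stores the training data

print("predicted class of first test point:", knn.predict(X_test[:1])[0])
print("test accuracy:", round(knn.score(X_test, y_test), 3))
```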
Support Vector Machine or SVM is one of the most popular Supervised Learning
algorithms.
It is used for Classification as well as Regression problems. However, primarily, it
is used for Classification problems in Machine Learning.
The goal of the SVM algorithm is to create the best line or decision boundary that
can segregate n-dimensional space into classes so that we can easily put the new
data point in the correct category in the future.
This best decision boundary is called a hyperplane.
SVM chooses the extreme points/vectors that help in creating the hyperplane.
These extreme cases are called support vectors, and hence the algorithm is termed
Support Vector Machine.
SVM algorithm can be used for Face detection, image classification, text
categorization, etc.
Hyperplane and Support Vectors in the SVM algorithm:
Hyperplane:
The best decision boundary that segregates the n-dimensional space into classes is
called the hyperplane.
Support Vectors:
The data points or vectors that are closest to the hyperplane and affect the
position of the hyperplane are termed support vectors.
Since these vectors support the hyperplane, they are called support vectors.
Types of SVM
Linear SVM: Linear SVM is used for linearly separable data, which means that if a
dataset can be classified into two classes using a single straight line, the data is
termed linearly separable, and the classifier used is called the Linear SVM
classifier.
Non-linear SVM: Non-linear SVM is used for non-linearly separable data, which
means that if a dataset cannot be classified using a straight line, the data is
termed non-linear data, and the classifier used is called the Non-linear SVM
classifier.
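A short sketch contrasting a linear SVM with a non-linear (RBF-kernel) SVM, assuming scikit-learn; the make_moons dataset (which is not linearly separable) and the parameter values are illustrative assumptions.
```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two interleaving half-moons: not separable by a single straight line.
X, y = make_moons(n_samples=500, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

linear_svm = SVC(kernel="linear", C=1.0).fit(X_train, y_train)
rbf_svm = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_train, y_train)

print("linear kernel accuracy:", round(linear_svm.score(X_test, y_test), 3))
print("RBF kernel accuracy   :", round(rbf_svm.score(X_test, y_test), 3))
print("number of support vectors (RBF):", rbf_svm.n_support_.sum())
```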
Extras:
Feature Selection Diagram: