
AAM UT-1 QB ANS

(2 Marks Questions):

Q.1) What is feature scaling?

Feature Scaling is a technique to standardize the independent features present in the data within a fixed range.

It is performed during data pre-processing to handle features with highly varying magnitudes, values, or units.
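For illustration, a minimal Python sketch (using scikit-learn, with a small made-up age/salary array) of the two common scaling approaches, normalization and standardization:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical [age, salary] rows with very different magnitudes
X = np.array([[25.0, 50000.0],
              [32.0, 64000.0],
              [47.0, 120000.0]])

# Normalization: rescale each feature to the fixed range [0, 1]
print(MinMaxScaler().fit_transform(X))

# Standardization: rescale each feature to zero mean and unit variance
print(StandardScaler().fit_transform(X))
```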

Q.2) State any four applications of random forest.



 Banking: Banking sector mostly uses this algorithm for the identification of
loan risk.

 Medicine: With the help of this algorithm, disease trends and risks of the
disease can be identified.

 Land Use: We can identify the areas of similar land use by this algorithm.

 Marketing: Marketing trends can be identified using this algorithm.


Q.3) How to select value of ‘K’ in K-nearest neighbor Algorithm?
(Mention two methods)

There is no particular way to determine the best value for "K", so we need to try
some values to find the best out of them.

 A commonly used default value for K is 5.

 A very low value for K, such as K = 1 or K = 2, can be noisy and make the model sensitive to outliers.

 Larger values of K give smoother decision boundaries, but values that are too large can under-fit and blur the class boundaries.

The two methods used are:

 Elbow Method: Test different values of K and choose the one where the
error rate stabilizes or decreases marginally.

 Cross-validation: Use techniques like k-fold cross-validation to determine the best K value by evaluating performance on different subsets of data, as shown in the sketch below.
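A hedged sketch of the cross-validation approach (the synthetic data and the set of candidate K values are arbitrary, chosen only for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic classification data (illustrative only)
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Evaluate a few candidate K values with 5-fold cross-validation
for k in [1, 3, 5, 7, 9, 11]:
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5)
    print(f"K={k}: mean accuracy = {scores.mean():.3f}")
# Choose the K after which accuracy stops improving noticeably (the "elbow")
```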
Q.4) List any two advantages and disadvantages of support vector machine

Advantages:

1. Effective in high-dimensional spaces: SVM is capable of handling data with a large number of features, making it suitable for high-dimensional datasets.

2. Works well with a clear margin of separation: SVM performs particularly well when
there is a distinct gap between classes, as it focuses on maximizing the margin between
them.

3. Robust to overfitting: Especially in high-dimensional spaces, SVM is less prone to overfitting when used with appropriate regularization.

4. Versatile with different kernels: SVM can be adapted to various types of data,
including non-linear ones, by using kernel tricks, which help in mapping the data to
higher-dimensional spaces.

Disadvantages:

1. Computationally expensive for large datasets: Training an SVM can be slow and
memory-intensive for large datasets due to its complexity, especially when using non-
linear kernels.

2. Difficult to choose the correct kernel function: Selecting the right kernel (e.g., linear,
polynomial, RBF) and tuning the associated parameters (like C and gamma) requires
expertise and can be challenging.

3. Not suitable for noisy data: SVM may not perform well when the data contains a lot of
noise or overlapping classes.

4. Requires careful tuning: SVM has several hyperparameters (like the regularization
parameter, kernel type, and margin) that need to be fine-tuned for optimal performance,
which can be time-consuming.
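As a rough illustration of the tuning effort mentioned above, the following sketch grid-searches the kernel, C, and gamma with scikit-learn (the parameter grid and the iris dataset are illustrative choices, not recommendations):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Hypothetical search grid over kernel, C and gamma
param_grid = {"kernel": ["linear", "rbf"],
              "C": [0.1, 1, 10],
              "gamma": ["scale", 0.1, 1]}

grid = GridSearchCV(SVC(), param_grid, cv=5)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```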
Q.5) Explain the types of support vector machines

There are two types of SVM, as given below:

Linear SVM: Linear SVM is used for linearly separable data. If a dataset can be divided into two classes by a single straight line, the data is termed linearly separable, and the classifier used is called a Linear SVM classifier.

Non-linear SVM: Non-linear SVM is used for non-linearly separable data. If a dataset cannot be classified by a straight line, the data is termed non-linear, and the classifier used is called a Non-linear SVM classifier; it relies on kernel functions (e.g., RBF or polynomial) to map the data to a higher-dimensional space.
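A small sketch contrasting the two types with scikit-learn (the blob and circle datasets are illustrative stand-ins for linearly and non-linearly separable data):

```python
from sklearn.datasets import make_blobs, make_circles
from sklearn.svm import SVC

# Linearly separable data -> Linear SVM
X_lin, y_lin = make_blobs(n_samples=200, centers=2, random_state=0)
linear_svm = SVC(kernel="linear").fit(X_lin, y_lin)

# Non-linearly separable data (concentric circles) -> Non-linear (RBF kernel) SVM
X_nl, y_nl = make_circles(n_samples=200, noise=0.05, factor=0.4, random_state=0)
nonlinear_svm = SVC(kernel="rbf").fit(X_nl, y_nl)

print(linear_svm.score(X_lin, y_lin), nonlinear_svm.score(X_nl, y_nl))
```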

Q.6) Enlist any FOUR decision tree terminology.

Root Node: Root node is from where the decision tree starts. It represents the
entire dataset, which further gets divided into two or more homogeneous sets.

Leaf Node: Leaf nodes are the final output node, and the tree cannot be segregated
further after getting a leaf node.

Splitting: Splitting is the process of dividing the decision node/root node into sub-
nodes according to the given conditions.

Branch/Sub Tree: A subtree formed by splitting a node of the tree.

Pruning: Pruning is the process of removing the unwanted branches from the tree.

Parent/Child node: A node that is divided into sub-nodes is called the parent node, and its sub-nodes are called the child nodes.
Q.7) State any TWO advantages of KNN algorithm

 Easy to use: KNN is simple to understand and implement with available tools.

 Handles noisy data well: It’s good at ignoring errors or outliers in data.

 Scales to large datasets: Works well with lots of data, though it can slow
down with huge datasets.

 Versatile: Can be used for both classification and predicting continuous values.

 Handles many features: Works well with datasets that have many different
characteristics or variables.
(4 Marks Questions):

Q.1) Describe process of feature engineering.

Feature Engineering is the process of creating new features or transforming existing features to improve the performance of a machine-learning model.

It involves selecting relevant information from raw data and transforming it into a
format that can be easily understood by a model.

The goal is to improve model accuracy by providing more meaningful and relevant
information.

The process of feature engineering is as given below:

 Feature Extraction: Identify relevant variables from raw data (e.g., extracting text length from a document).

 Feature Selection: Choose the most important features using methods like correlation, mutual information, or Recursive Feature Elimination (RFE).

 Feature Transformation: Modify features using scaling (Normalization, Standardization), encoding categorical variables (One-Hot Encoding, Label Encoding), or creating polynomial features.

 Feature Creation: Generate new meaningful features, such as time-based features from timestamps or domain-specific features.

 Handling Missing Values: Impute missing data using mean, median, or predictive modeling.

 Feature Reduction: Reduce dimensionality using PCA (Principal Component Analysis) or LDA (Linear Discriminant Analysis). (A rough sketch combining several of these steps is shown below.)
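A rough end-to-end sketch of several of these steps on a tiny hypothetical DataFrame (all column names and values are invented for illustration):

```python
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Tiny invented dataset
df = pd.DataFrame({
    "signup_time": pd.to_datetime(["2024-01-03 10:00", "2024-02-10 18:30", "2024-03-05 09:15"]),
    "city": ["Pune", "Mumbai", "Pune"],
    "age": [25.0, None, 31.0],
    "income": [40000, 52000, 61000],
})

df["age"] = df["age"].fillna(df["age"].median())                 # handling missing values
df["signup_hour"] = df["signup_time"].dt.hour                    # feature creation (time-based)
df = pd.get_dummies(df.drop(columns="signup_time"),
                    columns=["city"])                            # encoding a categorical feature

scaled = StandardScaler().fit_transform(df)                      # feature transformation (scaling)
reduced = PCA(n_components=2).fit_transform(scaled)              # feature reduction
print(reduced.shape)                                             # (3, 2)
```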
Q.2) Demonstrate Working of Decision Tree Algorithm with Attribute
Selection Measures

Decision Tree is a Supervised learning technique that can be used for both
classification and Regression problems.

It is a tree-structured classifier, where internal nodes represent the features of a dataset, branches represent the decision rules, and each leaf node represents the outcome.

Steps to build a Decision Tree:

1. Calculate the Entropy for the dataset:

Entropy(S) = − Σ p_i × log2(p_i), where p_i is the proportion of samples in S belonging to class i.

2. Compute Information Gain for each feature:

IG(S, A) = Entropy(S) − Σ (|S_v| / |S|) × Entropy(S_v), summed over every value v of feature A, where S_v is the subset of S with A = v.

The feature with the highest Information Gain (IG) is chosen for the first split.

3. Repeat the splitting process:

o Compute entropy and IG for remaining features.


o Continue splitting until stopping criteria (e.g., pure nodes, max depth) are met.

4. Make Predictions:

o Traverse the tree using feature values until a leaf node is reached.
o Assign the label of the leaf node as the prediction.

Example:
If we have a dataset with features Weather (Sunny, Rainy, Overcast) and a target Play
(Yes/No), we compute IG for Weather and split the tree accordingly.
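A minimal sketch of the entropy and information-gain computation for such a Weather/Play split (the eight records below are hypothetical, not taken from any table in this document):

```python
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((labels.count(c) / total) * log2(labels.count(c) / total)
                for c in set(labels))

# Hypothetical (Weather, Play) records -- invented counts for illustration
data = [("Sunny", "No"), ("Sunny", "No"), ("Sunny", "Yes"),
        ("Overcast", "Yes"), ("Overcast", "Yes"),
        ("Rainy", "Yes"), ("Rainy", "No"), ("Rainy", "Yes")]

play = [p for _, p in data]
parent_entropy = entropy(play)

# Weighted entropy after splitting on Weather, then the information gain
weighted = sum(
    (len([p for w, p in data if w == v]) / len(data)) *
    entropy([p for w, p in data if w == v])
    for v in {"Sunny", "Overcast", "Rainy"}
)
print("Entropy(S) =", round(parent_entropy, 3))
print("IG(S, Weather) =", round(parent_entropy - weighted, 3))
```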
Q.3) With a suitable example, explain how Bayes' Theorem is applied

Bayes' Theorem:

P(A | B) = [P(B | A) × P(A)] / P(B)

where:

P(A | B) = the conditional probability of event A occurring, given that B is true.

P(B | A) = the conditional probability of event B occurring, given that A is true.

P(A) and P(B) = the probabilities of A and B occurring independently of one another.
Example of Bayes Theorem:

Three bags contain 6 red, 4 black; 4 red, 6 black; and 5 red, 5 black balls respectively. One of the bags is selected at random and a ball is drawn from it. If the ball drawn is red, find the probability that it was drawn from the first bag.

Solution:
Let E1, E2, E3, and A be the events defined as follows:

E1 = first bag is chosen, E2 = second bag is chosen, E3 = third bag is chosen,
A = ball drawn is red

Each bag is equally likely to be chosen:
P(E1) = P(E2) = P(E3) = 1/3

Step 1: Probabilities of Drawing a Red Ball from Each Bag:
P(A | E1) = 6/10, P(A | E2) = 4/10, P(A | E3) = 5/10

Step 2: Find P(A) (Total Probability of Drawing Red):
P(A) = P(A | E1)P(E1) + P(A | E2)P(E2) + P(A | E3)P(E3)
     = (1/3)(6/10) + (1/3)(4/10) + (1/3)(5/10) = 15/30 = 1/2

Step 3: Apply Bayes' Theorem
P(E1 | A) = [P(A | E1) × P(E1)] / P(A) = [(6/10)(1/3)] / (1/2) = (2/10) / (1/2) = 2/5

Thus, the probability that the red ball was drawn from the first bag is 2/5 or 40%.
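A quick numeric check of this example in Python:

```python
# Numeric check of the three-bag example
p_bag = 1 / 3                                   # each bag equally likely
p_red_given_bag = [6 / 10, 4 / 10, 5 / 10]      # P(red | bag i)

p_red = sum(p_bag * p for p in p_red_given_bag)            # total probability = 0.5
p_bag1_given_red = (p_red_given_bag[0] * p_bag) / p_red    # Bayes' theorem
print(p_bag1_given_red)                                     # 0.4, i.e. 2/5
```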
Q.4) Describe types of ensemble learning methods.

Ensemble learning is a machine learning technique where multiple models (weak learners) are combined to improve overall prediction accuracy and robustness.

Instead of relying on a single model, ensemble methods aggregate the predictions of multiple models to reduce variance and bias and to improve generalization.

Types of Ensemble Learning:

1. Bagging (Bootstrap Aggregating):

 Multiple models are trained on different subsets of the dataset using bootstrapping (sampling with replacement).
 Final prediction is obtained by averaging (for regression) or majority voting
(for classification).
 Example: Random Forest (an ensemble of decision trees).

2. Boosting:

 Models are trained sequentially, with each new model correcting the errors of the
previous one.
 Boosting gives higher weights to misclassified instances to improve performance.
 Example: AdaBoost, Gradient Boosting, XGBoost.

3. Stacking:

 Uses multiple base models and combines their outputs using a meta-learner (a
higher-level model).
 The meta-learner learns how to best combine the base models’ predictions.
 Example: Combining Decision Trees, SVM, and Neural Networks.

4. Voting & Averaging:

 Aggregates predictions from multiple models using majority voting (for classification) or averaging (for regression).
 Example: Using Logistic Regression, KNN, and SVM together to make a final
decision.

Advantages:
 Increases model accuracy and reduces overfitting.
 Works well with both classification and regression tasks.
 Reduces variance and improves model robustness.
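A hedged scikit-learn sketch of bagging, boosting, and voting side by side (the breast-cancer dataset and the particular base models are illustrative choices):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

models = {
    "bagging (Random Forest)": RandomForestClassifier(n_estimators=100, random_state=0),
    "boosting (AdaBoost)": AdaBoostClassifier(n_estimators=100, random_state=0),
    "voting (LR + KNN + SVM)": VotingClassifier([
        ("lr", LogisticRegression(max_iter=5000)),
        ("knn", KNeighborsClassifier()),
        ("svm", SVC()),
    ]),
}

for name, model in models.items():
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean accuracy = {score:.3f}")
```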
Q.5) Consider the following training dataset of weather and apply Naive Bayes
Below is a training data set of weather and corresponding target variable ‘Play’
(suggesting possibilities of playing).
Now, we need to classify whether players will play or not based on weather
condition.
Problem: Players will play if the weather is sunny. Is this statement correct?
We can solve it using the above-discussed method of posterior Probability.
Dataset: 14 observations of Weather (Sunny / Overcast / Rainy) with the target Play (Yes / No).

Step 1: Frequency Table (Counts for Weather & Play):

Weather  | No | Yes
Overcast |  0 |  4
Rainy    |  3 |  2
Sunny    |  2 |  3
Total    |  5 |  9

Step 2: Likelihood Table (Conditional Probabilities):

P(Sunny) = 5/14 ≈ 0.36, P(Yes) = 9/14 ≈ 0.64, P(Sunny | Yes) = 3/9 ≈ 0.33

Given data: Weather = Sunny; we need P(Yes | Sunny) and P(No | Sunny).

Step 1: Apply Bayes' Theorem

P(Yes | Sunny) = P(Sunny | Yes) × P(Yes) / P(Sunny) = (3/9 × 9/14) / (5/14) = 0.60
P(No | Sunny) = P(Sunny | No) × P(No) / P(Sunny) = (2/5 × 5/14) / (5/14) = 0.40

Step 2: Interpretation

Since P(Yes | Sunny) = 0.60 is greater than P(No | Sunny) = 0.40, players are more likely to play when the weather is sunny.

Conclusion: The statement is likely correct, but not always certain.
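A small sketch reproducing this posterior calculation, assuming the commonly used 14-day weather counts (Sunny: 3 Yes and 2 No; totals 9 Yes and 5 No):

```python
# Assumed counts from the classic 14-day weather dataset
p_yes, p_no = 9 / 14, 5 / 14
p_sunny = 5 / 14
p_sunny_given_yes = 3 / 9
p_sunny_given_no = 2 / 5

p_yes_given_sunny = p_sunny_given_yes * p_yes / p_sunny   # = 0.60
p_no_given_sunny = p_sunny_given_no * p_no / p_sunny      # = 0.40
print(round(p_yes_given_sunny, 2), round(p_no_given_sunny, 2))
```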
