ML-Objectives-Mid-1

The document contains a series of questions and fill-in-the-blank statements related to machine learning concepts, algorithms, and methodologies. It covers topics such as supervised and unsupervised learning, classification and regression algorithms, evaluation techniques, and specific algorithms like K-Nearest Neighbors, Support Vector Machines, and Random Forest. The content is designed to test knowledge and understanding of key machine learning principles.

Uploaded by

nithin74728
Copyright
© All Rights Reserved

1. K-Nearest Neighbors (KNN) is classified as what type of machine learning
algorithm?
a) Instance-based learning
b) Parametric learning
c) Non-parametric learning
d) Model-based learning
2. Which of the following is not a supervised machine learning algorithm?
a) K-means
b) Naïve Bayes
c) SVM for classification problems
d) Decision tree
3. Which algorithm is best suited for a binary classification problem?
a) K-nearest Neighbors
b) Decision Trees
c) Random Forest
d) Linear Regression
4. What is the key difference between supervised and unsupervised learning?
a) Supervised learning requires labeled data, while unsupervised learning does not.
b) Supervised learning predicts labels, while unsupervised learning discovers patterns.
c) Supervised learning is used for classification, while unsupervised learning is used
for regression.
d) Supervised learning is always more accurate than unsupervised learning.
5. In supervised learning, the training dataset consists of:
a) Input features only
b) Output labels only
c) Input features and output labels
d) None of the above
6. Which supervised learning algorithm is known for its ability to handle both
classification and regression tasks?
a) Support Vector Machines (SVM)
b) Random Forest
c) K-nearest neighbors (KNN)
d) Linear Regression
7. The main objective of a classification algorithm in supervised learning is to:
a) Predict continuous values
b) Determine the optimal number of clusters
c) Assign input data to predefined categories or classes
d) Identify patterns in unlabeled data
8. Which algorithm is used to minimize the errors between predicted and actual outputs
in supervised learning?
a) Decision tree
b) Gradient Boosting
c) K-means clustering
d) Principal Component Analysis (PCA)
9. Which algorithm is prone to overfitting in supervised learning?
a) Logistic Regression
b) Support Vector Machines (SVM)
c) K-means clustering
d) Linear Regression
10. Which supervised learning algorithm is an ensemble method that combines multiple
weak learners to make predictions?
a) K-means clustering
b) Random Forest
c) Naive Bayes
d) K-nearest neighbors (KNN)
11. A model performs well on the training data but poorly on new, unseen data,
indicating:
a) Under-fitting
b) Over-fitting
c) Validated
d) Optimally balanced
12. Which of the following is considered in designing a machine learning system?
a) Choosing Training experience
b) Function approximation algorithm
c) Choosing Target Function
d) All of the above
13. Identify the type of unsupervised learning algorithm.
a) Naïve Bayes Classifier
b) Linear Regression
c) Decision Tree algorithm
d) K-Means Clustering algorithm
14. What is the bias-variance tradeoff in machine learning?
a) A concept related to feature selection.
b) Finding the right balance between bias and variance in a model.
c) The tradeoff between model complexity and computational cost.
d) The tradeoff between precision and recall.
15. In SVM, what is the kernel trick used for?
a) To increase the bias of the model
b) To transform data into a higher-dimensional space
c) To reduce the number of support vectors
d) To replace decision trees with kernels
16. Which algorithm is used to determine whether an employee will get a promotion
based on their performance?
a) K-Means Clustering
b) Logistic Regression
c) DBSCAN algorithm
d) KNN algorithm
17. Support Vector Machine is:
a) Logical model
b) Probabilistic model
c) Geometric model
d) None of the above
18. In a simple linear regression model (one independent variable), if we change the input
variable by 1 unit, how much will the output variable change?
a) By 1
b) No change
c) By intercept
d) By its slope
19. What is the fundamental idea behind the Random Forest model?
a) It constructs multiple decision trees and combines their predictions.
b) It uses a single decision tree for classification.
c) It employs a linear discriminant function.
d) It performs k-Nearest Neighbors classification.
20. Which of the following clustering algorithms follows a top-down approach?
a) K-means
b) Divisive
c) Agglomerative
d) None
21. What are the typical steps in the machine learning process?
a) Data collection, data preprocessing, feature engineering, model selection, training,
evaluation, and deployment.
b) Data collection, Data analysis, feature extraction, model validation, and testing.
c) Data preprocessing, model selection, and deployment.
d) Data collection, model training, and testing.
22. What is the name of the diagram that represents the tree structure of hierarchical
clustering?
a) Cluster plot
b) Decision tree
c) Dendrogram
d) Scatter plot
23. In the context of linear classification, what is a discriminant function?
a) A function that discriminates against certain data points
b) A function that transforms data to a higher-dimensional space
c) A function that defines a decision boundary between classes
d) A function that adds noise to the data
24. In which category does linear regression belong?
a) Neither supervised nor unsupervised learning
b) Both supervised and unsupervised learning
c) Unsupervised learning
d) Supervised learning
25. The learner is trying to predict housing prices based on the size of each house. What
type of regression is this?
a) Multivariate Logistic Regression
b) Logistic Regression
c) Linear Regression
d) Multivariate Linear Regression
26. How many variables are required to represent a linear regression model?
a) 3
b) 2
c) 1
d) 4
27. The cost functions for logistic regression and linear regression are the same.
a) True
b) False
28. Which of the following statements is not true about the Decision Tree?
a) It can be applied to binary classification problems only.
b) It is a predictor that predicts the label associated with an instance by traveling from
a root node of a tree to a leaf.
c) At each node, the successor child is chosen based on a splitting of the input space.
d) The splitting is based on one of the features or a predefined set of splitting rules.
29. Which is not true about clustering?
a) A collection of objects based on similarity and dissimilarity between them.
b) Dividing the population or data points into a number of clusters.
c) An unsupervised learning method.
d) Identifies the category of new observations based on training data.
30. Which is conclusively produced by Hierarchical Clustering?
a) Final estimation of cluster centroids
b) Tree showing how nearby things are to each other
c) Assignment of each point to clusters
d) All of these
31. Which of the following is a good technique to evaluate the performance of a machine
learning model?
a) Sampling
b) Parameter Tuning
c) Cross-validation
d) Stratification
32. Which of the following is a widely used and effective machine learning algorithm
based on the idea of bagging?
a) Random Forest
b) Regression
c) Classification
d) Decision Tree
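
The scenario in question 18 can be checked numerically. The following is a minimal sketch, not a reference solution: the data points are made up for illustration, and `fit_line` and `predict` are hypothetical helper names. It fits a simple linear regression by ordinary least squares and prints how much the prediction moves when the input changes by one unit, so the result can be compared against the fitted slope and intercept.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one input variable; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel in the ratio).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Prediction of the fitted line at input x."""
    return intercept + slope * x


# Made-up house sizes and prices, for illustration only.
sizes = [8.0, 10.0, 12.0, 15.0, 20.0]
prices = [160.0, 195.0, 240.0, 310.0, 390.0]

slope, intercept = fit_line(sizes, prices)
# Change in the predicted output when the input moves from 10 to 11.
change = predict(11.0, slope, intercept) - predict(10.0, slope, intercept)

print("slope:", slope)
print("intercept:", intercept)
print("change in prediction for a 1-unit change in input:", change)
```

Running the sketch with other input pairs one unit apart (for example 14 and 15) gives the same change, because the fitted model is a straight line.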

Fill in the blanks

33. The learner is trying to predict housing prices based on the size of each house. The
variable “size” is ___________
34. The target variable is represented along ____________
35. The learner is trying to predict the cost of papaya based on its size. The variable
“cost” is __________
36. The independent variable is represented along _________
37. A telecommunications company wants to segment its customers into distinct
groups; this is an example of ________________ learning.

38. ______________________ is a machine learning training method based on rewarding
desired behaviors and punishing undesired ones.
39. ___________________ function transforms the raw output scores into a probability
distribution over two classes, ensuring that the probabilities range between 0 and 1.
40. ____________________ are parameters that are set before the machine learning
model is trained and remain fixed during training.
41. ______________________________ is a widely used method for estimating the
parameters of a probability distribution from observed data.
42. ________________________ is the space of all possible values that the weights of a
machine learning model can take.

43. ______________________ is a smoothing technique that helps tackle the problem of
zero probability in the Naïve Bayes machine learning algorithm.
44. ______________________ is a table for defining the performance of a classification
algorithm.
45. ______________________________ is a bottom-up hierarchical clustering approach
where each data point starts as its own cluster and merges iteratively.
46. ____________________ is a statistical approach that represents the linear relationship
between two or more variables, either dependent or independent.
47. The __________________ machine learning algorithm can be used for imputing missing
values of both categorical and continuous variables.
48. _______________ is the measurement of disorder/randomness in a dataset or
impurities in the information processed in machine learning.
49. ___________________ models aim to model the conditional distribution of the output
variable given the input variable. They learn a decision boundary that separates the
different classes of the output variable.
50. ____________________ clustering aims to partition n observations into k clusters in
which each observation belongs to the cluster with the nearest centroid.
51. In ________________________________ models, we assume that the data is
generated by an underlying probability distribution.
52. _________________________ is a clustering algorithm that relies on maximizing the
likelihood to find the statistical parameters of the underlying sub-populations in the
dataset.
53. ____________ type of machine learning algorithm falls under the category of
“unsupervised learning.”
54. ______________ uses the inductive learning machine learning approach.
55. A _________ is a decision support tool that uses a tree-like graph or model of
decisions and their possible consequences, including chance event outcomes, resource
costs, and utility.
56. In a decision tree, the equation of entropy measure is
______________________________ .
57. Decision tree means ___________________________
58. Decision Nodes are represented by ____________
59. _______________ is the goal of supervised learning.
60. _____________ is an example of a classification problem.
61. Regression means _________________
62. MLR full form is ________________
63. HCA full form is ___________________
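
The partitioning described in item 50, where each observation belongs to the cluster with the nearest centroid, can be sketched as follows. This is a minimal illustration under stated assumptions: the 2-D points and the two starting centroids are made up, Euclidean distance is assumed, and `nearest_centroid`, `assign`, and `update` are hypothetical helper names.

```python
import math


def nearest_centroid(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))


def assign(points, centroids):
    """Label each observation with the index of its nearest centroid."""
    return [nearest_centroid(p, centroids) for p in points]


def update(points, labels, centroids):
    """Move each centroid to the mean of its members; keep it if it has none."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            new_centroids.append(old)
        else:
            new_centroids.append(
                tuple(sum(coord) / len(members) for coord in zip(*members))
            )
    return new_centroids


# Made-up observations and starting centroids (k = 2), for illustration only.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids = [(0.0, 0.0), (10.0, 10.0)]

# Alternate assignment and update until the centroids stop moving.
for _ in range(10):
    labels = assign(points, centroids)
    new_centroids = update(points, labels, centroids)
    if new_centroids == centroids:
        break
    centroids = new_centroids

print("labels:", labels)
print("centroids:", centroids)
```

The starting centroids are fixed here to keep the sketch deterministic; in practice they are usually chosen at random, and the result can depend on that choice.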
