Data Mining Algorithms MCQs

The document contains a series of multiple-choice questions (MCQs) focused on data mining algorithms, particularly association rules, classification, and clustering methods. Each question is followed by the correct answer, covering topics such as the purpose of association rules, characteristics of classification models, and various clustering algorithms. The content serves as a practice tool for understanding key concepts in data mining.

Uploaded by

abdatadalacha5
Data Mining Algorithms - MCQ Practice

Questions
1. What is the main purpose of association rules in data mining?

 A. To create database indexes
 B. To detect noise in data
 C. To identify relationships between data items
 D. To sort data alphabetically

✅ Answer: C

2. Which of the following is NOT a typical use case of association rules?

 A. Diagnosing medical conditions
 B. Optimizing website interfaces
 C. Designing better sales strategies
 D. Encrypting data

✅ Answer: D

3. In association rule mining, what does the "antecedent" refer to?

 A. The item found with the consequent
 B. The item that appears first in a transaction
 C. The item that leads to the consequent (the “if” part)
 D. The frequency of itemset occurrence

✅ Answer: C

4. What is the measure called that shows how frequently items appear together in the dataset?

 A. Confidence
 B. Support
 C. Lift
 D. Strength

✅ Answer: B

5. A rule appears frequently in the dataset but rarely holds true when applied. What does this indicate?

 A. High support, high confidence
 B. Low support, low confidence
 C. High support, low confidence
 D. Low support, high confidence

✅ Answer: C

6. What does a lift value of 1 indicate in association rule mining?

 A. Negative correlation
 B. Positive correlation
 C. No correlation
 D. High support

✅ Answer: C
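
As a hedged illustration of the measures in questions 4 to 6, the sketch below computes support, confidence, and lift on a made-up set of transactions. The transaction data and the function names are assumptions chosen for this example; they do not come from the quiz's source material.

```python
# Toy transactions, invented purely for illustration.
transactions = [
    {"bread", "milk"},
    {"bread"},
    {"milk"},
    {"eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """How often the consequent appears in transactions containing the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence divided by the baseline frequency of the consequent."""
    return confidence(antecedent, consequent, transactions) / support(consequent, transactions)


rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
rule_lift = lift({"bread"}, {"milk"}, transactions)
print(rule_support, rule_confidence, rule_lift)
```

In this toy data, bread and milk each appear in half of the transactions and together in a quarter of them, so the lift of "if bread then milk" works out to 1: buying bread tells us nothing about buying milk.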

7. Which algorithm generates candidate itemsets by joining large itemsets from the previous pass with themselves?

 A. AIS
 B. SETM
 C. Apriori
 D. K-means

✅ Answer: C
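
The self-join described in question 7 can be sketched roughly as follows. This is a minimal illustration, assuming itemsets are stored as sorted tuples; the function name `apriori_gen` and the sample itemsets are chosen for this example. It also includes the pruning step usually paired with the join, which drops any candidate that has an infrequent subset.

```python
from itertools import combinations


def apriori_gen(frequent_prev):
    """Build size-k candidates from the frequent (k-1)-itemsets.

    `frequent_prev` is an iterable of sorted tuples, all of length k-1.
    """
    prev = sorted(set(frequent_prev))
    prev_set = set(prev)
    candidates = []
    for i, a in enumerate(prev):
        for b in prev[i + 1:]:
            # Join step: merge two itemsets that share their first k-2 items.
            if a[:-1] == b[:-1]:
                candidate = a + (b[-1],)
                # Prune step: every (k-1)-subset must itself be frequent.
                if all(sub in prev_set for sub in combinations(candidate, len(a))):
                    candidates.append(candidate)
    return candidates


large_3 = [(1, 2, 3), (1, 2, 4), (1, 3, 4), (1, 3, 5), (2, 3, 4)]
print(apriori_gen(large_3))
```

With these sample itemsets the join produces (1, 2, 3, 4) and (1, 3, 4, 5), and pruning removes the second because (1, 4, 5) is not among the frequent 3-itemsets.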

8. Which algorithm saves the transaction ID with each candidate itemset?

 A. AIS
 B. Apriori
 C. Decision Tree
 D. SETM

✅ Answer: D

9. Why is the Apriori algorithm more efficient than AIS and SETM?

 A. It uses fewer variables
 B. It only explores itemsets that meet minimum support
 C. It doesn't scan the database
 D. It ignores confidence values

✅ Answer: B

10. In which business application is market basket analysis commonly used?

 A. Employee training
 B. Product recommendation
 C. Supply chain management
 D. Retail sales

✅ Answer: D

11. In the diaper and beer example, what percentage of transactions included both items?

 A. 2.75%
 B. 2%
 C. 1.75%
 D. 0.5%

✅ Answer: C

12. What does a high confidence and low support in an association rule imply?

 A. The rule is common but often incorrect
 B. The rule is rare but often correct
 C. The rule is common and always correct
 D. The rule is never useful

✅ Answer: B

13. Which of the following is the correct form of an association rule?

 A. if [condition] then [result]
 B. for [every] do [action]
 C. select * from [database]
 D. input [x] to get [y]

✅ Answer: A

14. What is the first step in building a classification model?

 A. Testing with unlabeled data
 B. Using normalization techniques
 C. Creating a classifier with labeled training data
 D. Drawing a pie chart

✅ Answer: C

15. Which of the following best describes classification?

 A. Predicting numerical outcomes
 B. Removing duplicate data entries
 C. Assigning a category label to new observations
 D. Encrypting sensitive information

✅ Answer: C

16. What is the output of a classification model?

 A. A numerical estimate
 B. A set of decision rules
 C. A category label
 D. A list of all itemsets

✅ Answer: C

17. What is the major difference between classification and prediction?

 A. Classification uses regression; prediction does not
 B. Classification is unsupervised; prediction is supervised
 C. Classification predicts categories; prediction estimates numerical values
 D. Prediction uses only text data

✅ Answer: C

18. Which of the following is an example of a classification algorithm?

 A. Linear Regression
 B. DBSCAN
 C. K-Means
 D. Naive Bayes

✅ Answer: D

19. In data preparation, which method helps scale values into a small specified range?

 A. Generalization
 B. Noise Reduction
 C. Normalization
 D. Aggregation

✅ Answer: C
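
One common way to carry out the scaling in question 19 is min-max normalization. The sketch below is an illustration under that assumption; the quiz does not say which normalization variant its source material has in mind, and the function name and sample values are made up.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Rescale `values` linearly so they fall in [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so map them all to the lower bound.
        return [new_min for _ in values]
    return [
        (v - lo) / (hi - lo) * (new_max - new_min) + new_min
        for v in values
    ]


incomes = [10, 20, 30]
print(min_max_normalize(incomes))
```

Here the smallest value maps to the lower bound, the largest to the upper bound, and everything else falls proportionally in between.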

20. Which algorithm is best suited for predicting continuous values?

 A. Logistic Regression
 B. K-Nearest Neighbors
 C. Linear Regression
 D. Decision Tree Classification

✅ Answer: C

21. Which is NOT a stage in the data classification lifecycle?

 A. Storage
 B. Forecasting
 C. Sharing
 D. Publication

✅ Answer: B

22. Which of the following algorithms is used in both classification and prediction?

 A. Logistic Regression
 B. Apriori
 C. DBSCAN
 D. K-Means

✅ Answer: A

23. What does a decision tree output when used for classification?

 A. A list of association rules
 B. A numerical score
 C. A continuous value
 D. A categorical decision based on branches

✅ Answer: D

24. What technique is used to identify whether two attributes are related?

 A. Clustering
 B. Correlation analysis
 C. Filtering
 D. Regression

✅ Answer: B

25. What is the goal of relevance analysis in data preparation?

 A. Create predictive models
 B. Remove noise
 C. Identify useful attributes
 D. Encode categorical data

✅ Answer: C

26. What fundamental principle does the Naive Bayes algorithm rely on?

 A. Decision boundaries
 B. Bayes' theorem
 C. Distance measures
 D. Gaussian distribution

✅ Answer: B
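
To make the answer to question 26 concrete, the sketch below applies Bayes' theorem with the "naive" assumption that features are independent given the class. All probabilities, class names, and the function name are invented for illustration.

```python
def naive_bayes_posterior(priors, likelihoods, observed):
    """Return P(class | observed features) for each class.

    priors:      {class: P(class)}
    likelihoods: {class: {feature: P(feature | class)}}
    observed:    list of features seen in the new observation
    """
    scores = {}
    for label, prior in priors.items():
        score = prior
        for feature in observed:
            # Naive assumption: features are independent given the class,
            # so their likelihoods are simply multiplied together.
            score *= likelihoods[label][feature]
        scores[label] = score
    total = sum(scores.values())  # P(observed), the evidence term
    return {label: score / total for label, score in scores.items()}


# Hypothetical numbers: 20% of mail is spam; "offer" appears in 50% of spam
# and in 12.5% of non-spam ("ham") messages.
priors = {"spam": 0.2, "ham": 0.8}
likelihoods = {"spam": {"offer": 0.5}, "ham": {"offer": 0.125}}
posterior = naive_bayes_posterior(priors, likelihoods, ["offer"])
print(posterior)
```

With these made-up numbers, seeing the word "offer" raises the probability of spam from the 20% prior to about 50%.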

27. What type of learning method is clustering considered?

 A. Supervised
 B. Reinforcement
 C. Unsupervised
 D. Semi-supervised

✅ Answer: C

28. In a shopping mall, grouping similar items together like t-shirts or vegetables is an example of:

 A. Supervised learning
 B. Data labeling
 C. Clustering
 D. Classification

✅ Answer: C

29. Which of the following is an application of clustering?

 A. Data encryption
 B. Market segmentation
 C. Linear regression
 D. Data normalization

✅ Answer: B

30. Which clustering method does NOT require the number of clusters to be specified in advance?

 A. K-Means
 B. Partitioning Clustering
 C. Hierarchical Clustering
 D. Fuzzy Clustering

✅ Answer: C

31. Which clustering method allows data points to belong to more than one cluster?

 A. Hard clustering
 B. K-Means
 C. Agglomerative Hierarchical
 D. Fuzzy Clustering

✅ Answer: D
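
The partial membership behind question 31 can be illustrated with the membership formula used in fuzzy c-means, where a point's degree of membership in a cluster depends on its distance to that cluster's center relative to its distances to all centers. This is a sketch for one-dimensional data with distinct centers; the function name, the fuzzifier value `m=2.0`, and the sample numbers are assumptions made for this example.

```python
def fuzzy_memberships(point, centers, m=2.0):
    """Membership of `point` in each cluster (fuzzy c-means style).

    `m` > 1 is the fuzzifier; larger values give softer memberships.
    Assumes the centers are distinct.
    """
    distances = [abs(point - c) for c in centers]
    if 0 in distances:
        # The point sits exactly on a center: full membership there.
        return [1.0 if d == 0 else 0.0 for d in distances]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


memberships = fuzzy_memberships(1.0, [0.0, 4.0])
print(memberships)
```

The memberships sum to 1, and the point at 1.0 belongs mostly, but not exclusively, to the cluster centered at 0.0.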

32. Which algorithm is an example of a density-based clustering model?

 A. DBSCAN
 B. K-Means
 C. Naive Bayes
 D. Decision Tree

✅ Answer: A

33. What is a limitation of density-based clustering methods like DBSCAN?

 A. Requires pre-labeled data
 B. Can't handle outliers
 C. Difficulty with varying densities and high dimensions
 D. Needs class labels

✅ Answer: C

34. What type of model does the Expectation-Maximization (EM) algorithm use in clustering?

 A. Centroid-based
 B. Rule-based
 C. Distribution-based
 D. Tree-based

✅ Answer: C

35. What type of clustering algorithm is K-Means?

 A. Density-based
 B. Distribution-based
 C. Partitioning
 D. Hierarchical

✅ Answer: C
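
As a rough illustration of why K-Means counts as a partitioning method (question 35), the sketch below alternates between assigning each point to its nearest centroid and recomputing each centroid as the mean of its points. The starting centroids are supplied by the caller to keep the run deterministic; the function name and the sample points are assumptions for this example.

```python
def k_means(points, centroids, iterations=10):
    """Lloyd-style k-means on 2-D points with caller-supplied starting centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: (p[0] - centroids[i][0]) ** 2
                + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean_x = sum(x for x, _ in members) / len(members)
                mean_y = sum(y for _, y in members) / len(members)
                new_centroids.append((mean_x, mean_y))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
final_centroids, final_clusters = k_means(points, [(0, 0), (10, 10)])
print(final_centroids, final_clusters)
```

Every point ends up in exactly one of the k groups, which is what distinguishes this hard partitioning from the fuzzy approach.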

36. Which algorithm performs bottom-up hierarchical clustering?

 A. K-Means
 B. DBSCAN
 C. Agglomerative Hierarchical
 D. Expectation-Maximization

✅ Answer: C
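
The bottom-up process in question 36 can be sketched as follows: start with every point in its own cluster and repeatedly merge the two closest clusters, recording each merge. This minimal version uses one-dimensional data and single linkage (distance between the closest pair of members); the linkage choice, the function name, and the sample points are assumptions for illustration.

```python
def agglomerative_merges(points):
    """Return the merge history of single-linkage agglomerative clustering.

    Each entry is (cluster_a, cluster_b, distance) in the order merged.
    """
    clusters = [[p] for p in points]  # every point starts as its own cluster
    history = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        history.append((clusters[i], clusters[j], d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return history


merges = agglomerative_merges([1, 2, 9, 10, 25])
for left, right, distance in merges:
    print(left, "+", right, "at distance", distance)
```

No cluster count is passed in: the merge history plays the role of a dendrogram, and a grouping is chosen afterwards by deciding at which distance to stop merging.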

37. Which algorithm avoids the need to define the number of clusters and works by message passing?

 A. K-Means
 B. Affinity Propagation
 C. DBSCAN
 D. Mean-shift

✅ Answer: B

38. What is the most common use of clustering in biology?

 A. Labeling DNA sequences
 B. Identifying cancerous cells
 C. Sorting email messages
 D. Constructing protein molecules

✅ Answer: B

39. In clustering, what does “soft clustering” mean?

 A. Data is split randomly
 B. Each data point belongs to exactly one group
 C. Each data point may belong to multiple groups
 D. Clusters are based on soft data types

✅ Answer: C

40. What does the Mean-Shift algorithm do?

 A. Sorts data into fixed clusters
 B. Finds dense regions and updates centroids
 C. Uses decision trees for classification
 D. Labels data using Naive Bayes

✅ Answer: B
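
The behavior described in question 40 can be sketched for one-dimensional data as below. Each point is repeatedly moved to the mean of the data points within a fixed window around it until it stops moving, so it drifts toward a dense region. The flat (uniform) window, the function name, the bandwidth, and the sample points are all assumptions made for this illustration.

```python
def mean_shift_1d(points, bandwidth, tol=1e-6, max_iter=100):
    """Shift each point toward the nearest dense region (flat kernel)."""
    modes = []
    for x in points:
        for _ in range(max_iter):
            # Data points inside the window centered on the current position.
            window = [p for p in points if abs(p - x) <= bandwidth]
            shifted = sum(window) / len(window)
            if abs(shifted - x) < tol:
                break  # converged: the window mean no longer moves
            x = shifted
        modes.append(x)
    return modes


modes = mean_shift_1d([1, 2, 3, 10, 11, 12], bandwidth=3)
print(modes)
```

Points that converge to the same position are treated as one cluster, so the number of clusters falls out of the data and the bandwidth rather than being fixed beforehand.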
