
Question Bank

Year & Semester: III & V


Subject Code & Subject Name: U14CS518 & Data Warehousing and Data Mining
Unit – IV
PART – A
1. What is true of the multidimensional model?
a. It typically requires less disk storage
b. It typically requires more disk storage
c. Typical business queries requiring aggregate functions take more time
d. Increasing the size of a dimension is difficult
2. Learning is,

a. The process of finding the right formal representation of a certain body of knowledge
in order to represent it in a knowledge-based system
b. It automatically maps an external signal space into a system's internal
representational space. They are useful in the performance of classification tasks.
c. A process where an individual learns how to carry out a certain task when making a
transition from a situation in which the task cannot be carried out to a situation in
which the same task under the same circumstances can be carried out.
d. None of these
3. The Apriori property means
a. If a set cannot pass a test, all of its supersets will fail the same test as well
b. To improve the efficiency of the level-wise generation of frequent itemsets
c. If a set can pass a test, all of its supersets will fail the same test as well
d. To decrease the efficiency of the level-wise generation of frequent itemsets
4. Which is the technique used for classification in data mining?
a. Descriptive pattern c. Decision tree classifiers
b. Associations d. Regression
5. Which algorithm is used to build a decision tree classifier from a given set of training
instances?
a. Greedy algorithm c. ETL algorithm
b. Bayes algorithm d. None of the above
6. ____________ deals with the prediction of a value rather than a class.
a. Regression c. Recall
b. Precision d. Multiway splits
7. Which is a type of classifier that has been found to give very accurate classification
across a range of applications?

a. Binary split c. Overfitting


b. Multiway split d. Support vector machine
8. Which of the following is/are mined in frequent pattern analysis?

a. set of items b. subsequences

c. substructures d. All of the above

9. Which of the following techniques does not improve the efficiency of the Apriori algorithm?

a. Transaction reduction b. Partitioning

c. Hash-based itemset counting d. FP growth

10. Which of the following is an example for frequent itemset mining?

a. Market basket analysis c. Clustering


b. Cross-marketing d. All of the above
11. Mining frequent itemsets without candidate generation is called _______________.
a. Market basket analysis c. Frequent pattern growth
b. Apriori algorithm d. None of the above
12. In a data cube, the base cuboid is called as,
a. Dimension cuboids c. Apex cuboids
b. Dimension cuboids d. 3-Dimension cuboids
13. Prediction is,
a. The result of the application of a theory or a rule in a specific case
b. One of several possible enters within a database table that is chosen by the designer as
the primary means of accessing the data in the table.
c. Discipline in statistics that studies ways to find the most interesting projections of multi-
dimensional spaces.
d. None of these
14. Frequent pattern growth adopts a divide-and-conquer strategy; it consists of,
a. Conditional databases & Frequent-pattern tree
b. Frequent-pattern tree & Conditional databases
c. FP-tree d. Set of conditional databases
15. Which type of data format is adopted by Apriori and FP-growth for frequent pattern mining?
a. Vertical data format c. Both (a) & (b)
b. Horizontal data format d. None of the above
16. A single-dimensional association rule is also called as _________________
a. Intradimensional association rule
b. Interdimensional association rules
c. Hybrid-dimensional association rules
d. None of the above
17. What is ARCS?
a. Association Regression Classification System
b. Association Rule Classification System
c. Association Rule Clustering System
d. Algorithm for Rule Classification System
18. When an itemset S satisfies the constraint, it is called ____________
a. Monotonic c. Succinctness
b. Anti-monotonic d. Optimization
19. The data classification process consists of,
a. Learning & Classification c. Supervised & classification
b. Unsupervised & clustering d. Learning & Clustering
20. What are the two components of a belief network,
a. Probabilistic networks & probability tables
b. Bayesian networks & Directed acyclic graph
c. Directed acyclic graph & set of conditional probability tables
d. None of the above
21. ________________ is the process of finding a model that describes and distinguishes data
classes or concepts.
a. Data Characterization c. Data discrimination
b. Data Classification d. Data selection
22. What classifiers are normally considered to be easy to interpret?
a. SVM c. Decision trees
b. Linear Regression d. k-Nearest Neighbor
23. Disjoint training and test datasets are required to estimate the classification
performance on . . .
a. The training dataset c. The entire population
b. The test dataset d. None of The Above
24. The confidence of the estimate of classification performance increases with . . .
a. increasing training dataset size
b. decreasing training dataset size
c. increasing test dataset size
d. decreasing test dataset size
25. A common weakness of association rule mining is that . . .
a. it is too inefficient

b. it produces too many rules
c. it does not produce enough interesting rules
d. it produces too many frequent itemsets
26. Which of the following are interestingness measures for association rules?
a. accuracy c. compactness
b. recall d. lift
27. The rule age(X, "youth") AND income(X, "low") -> class(X, "B") is an example of?

a. Decision tree c. Neural network


b. If-then d. All of the above
28. If confidence(A => B) = P(B|A), what is the confidence equation?
a. Support_count(A∪B) / Support_count(A) b. Support_count(A∩B) / Support_count(B)
c. Support_count(A∪B) / Support_count(B) d. Support_count(A) / Support_count(A∪B)
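The confidence equation in question 28 can be evaluated with a tiny sketch (the function name and the counts below are illustrative, not from the question bank):

```python
def confidence(support_count_a_union_b, support_count_a):
    """Confidence of rule A => B: support_count(A U B) / support_count(A)."""
    return support_count_a_union_b / support_count_a

# Example: A U B occurs in 2 transactions, A occurs in 4 -> confidence 0.5.
print(confidence(2, 4))  # 0.5
```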

29. If the number of transactions is five and the minimum support threshold is 60%, then
what is the minimum support count?

a. 2 c. 4
b. 3 d. 1
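The arithmetic behind question 29 can be sketched directly (`min_support_count` is an illustrative helper, not a standard API):

```python
import math

def min_support_count(num_transactions, relative_threshold):
    """Absolute support count implied by a relative support threshold."""
    return math.ceil(num_transactions * relative_threshold)

# Five transactions at a 60% threshold require a count of at least 3.
print(min_support_count(5, 0.60))  # 3
```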

30. If the resulting value of the lift equation, lift(A, B) = P(A∪B) / (P(A)P(B)), is less
than 1, then what is the occurrence of A & B?
a. Independent and there is no correlation
b. Negatively correlated c. Positively correlated
d. Dependent and there is no correlation
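The lift measure, lift(A, B) = P(A∪B) / (P(A) P(B)), is the standard correlation test behind question 30; a minimal sketch with made-up probabilities:

```python
def lift(p_a_union_b, p_a, p_b):
    """lift(A, B) = P(A U B) / (P(A) * P(B))."""
    return p_a_union_b / (p_a * p_b)

# lift < 1: negatively correlated; lift = 1: independent; lift > 1: positively correlated.
print(lift(0.10, 0.50, 0.50))  # 0.4 -> A and B are negatively correlated
```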

31. What is the hash bucket address if the values of two frequent itemset (x & y) are 2 and 3?
a. 4 b. 2 c. 0 d. 5

32. When an itemset S satisfies the constraint, so does any of its supersets; sum(S.Price) ≥ v is?

a. Anti Monotonicity c. Monotonicity


b. Succinctness d. Convertible
33. How do you find the midpoint between each pair of adjacent values ai and ai+1, if it is
considered as a possible split point?
a. (ai+ai+1)/2 c. (ai+ai+2)/4
b. (ai+ai-1)/2 d. (ai+ai+2)/4
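The midpoint rule in question 33 yields one candidate split point per pair of adjacent sorted values; a short sketch (the function name is illustrative):

```python
def candidate_split_points(values):
    """Midpoints (a_i + a_{i+1}) / 2 between adjacent sorted values of a
    continuous attribute, each a possible decision-tree split point."""
    vals = sorted(values)
    return [(vals[i] + vals[i + 1]) / 2 for i in range(len(vals) - 1)]

print(candidate_split_points([30, 40, 50]))  # [35.0, 45.0]
```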

34. IF age = youth AND student = yes THEN buys computer = yes; in this rule, what are the IF
part and THEN part?

a. Rule consequent & Rule condition


b. Rule precondition & Rule antecedent
c. Rule consequent & Rule antecedent
d. Rule antecedent & Rule consequent
35. How do you compute the error propagated backward for a network unit's prediction?
a. Errj = Oj (1 − Oj)(Tj − Oj) b. Δwij = (l) Errj Oi
c. wij = wij + Δwij d. Δθj = (l) Errj
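The error term for an output unit in option (a), Errj = Oj(1 − Oj)(Tj − Oj), can be evaluated directly (sigmoid output units assumed; the values are illustrative):

```python
def output_unit_error(o_j, t_j):
    """Backpropagation error of output unit j with sigmoid activation:
    Err_j = O_j * (1 - O_j) * (T_j - O_j)."""
    return o_j * (1 - o_j) * (t_j - o_j)

# Actual output 0.8 against target 1.0: 0.8 * 0.2 * 0.2 = 0.032.
print(output_unit_error(0.8, 1.0))
```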

36. The rule is IF income = high THEN loan decision = accept. If we add the attribute test
credit rating = excellent to the rule, what is the current rule?
a. IF income = high AND loan decision = accept THEN credit rating = excellent
b. IF loan decision = accept AND credit rating = excellent THEN income = high
c. IF income = high AND credit rating = excellent THEN loan decision = accept
d. None of the above
37. What does Apriori algorithm do?
a. It mines all frequent patterns through pruning rules with lesser support
b. It mines all frequent patterns through pruning rules with higher support
c. Both a and b
d. None of the above
38. What does FP growth algorithm do?
a. It mines all frequent patterns through pruning rules with lesser support
b. It mines all frequent patterns through pruning rules with higher support
c. It mines all frequent patterns by constructing a FP tree
d. All of the above
39. What techniques can be used to improve the efficiency of the Apriori algorithm?
a. Hash-based techniques
b. Transaction reduction
c. Support(A∪B) / Support(A)
d. Support(A∪B) / Support(B)
40. Which of the following is direct application of frequent itemset mining?
a. Social Network Analysis
b. Market Basket Analysis
c. Outlier Detection
d. Intrusion Detection
41. What is not true about FP growth algorithms?
a. It mines frequent itemsets without candidate generation.
b. There are chances that FP trees may not fit in the memory
c. FP trees are very expensive to build
d. It expands the original database to build FP trees.
42. When do you consider an association rule interesting?
a. If it only satisfies min_support
b. If it only satisfies min_confidence
c. If it satisfies both min_support and min_confidence
d. There are other measures to check as well
43. What is the difference between absolute and relative support?
a. Absolute - Minimum support count threshold and Relative - Minimum support threshold
b. Absolute - Minimum support threshold and Relative - Minimum support count threshold
c. Both mean the same
44. What is the relation between candidate and frequent itemsets?
a. A candidate itemset is always a frequent itemset
b. A frequent itemset must be a candidate itemset
c. No relation between the two
d. Both are same
45. Which technique finds the frequent itemsets in just two database scans?
a. Partitioning
b. Sampling
c. Hashing
d. Dynamic itemset counting
46. Which of the following is true?
a. Both Apriori and FP-Growth use horizontal data format
b. Both Apriori and FP-Growth use vertical data format
c. Apriori uses horizontal and FP-Growth uses vertical data format
d. Apriori uses vertical and FP-Growth uses horizontal data format
47. What is the principle on which the Apriori algorithm works?
a. If a rule is infrequent, its specialized rules are also infrequent
b. If a rule is infrequent, its generalized rules are also infrequent
c. Both a and b
d. None of the above
48. Which of these is not a frequent pattern mining algorithm?
a. Apriori
b. FP growth
c. Decision trees
d. Eclat
49. Which algorithm requires fewer scans of data?
a. Apriori
b. FP growth
c. Both a and b
d. None of the above
50. What are closed itemsets?
a. An itemset for which at least one proper super-itemset has the same support
b. An itemset whose no proper super-itemset has the same support
c. An itemset for which at least one super-itemset has the same confidence
d. An itemset whose no proper super-itemset has the same confidence
51. What are closed frequent itemsets?
a. A closed itemset
b. A frequent itemset
c. An itemset which is both closed and frequent

d. None of the above
52. What are maximal frequent itemsets?
a. A frequent itemset whose no super-itemset is frequent
b. A frequent itemset whose super-itemset is also frequent
c. A non-frequent itemset whose super-itemset is frequent
d. None of the above
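The definitions in questions 50-52 can be made concrete with a brute-force sketch over an assumed toy dataset (not from the question bank):

```python
from itertools import combinations

# Assumed toy dataset: supports are a:3, b:3, c:2, ab:3, ac:2, bc:2, abc:2.
transactions = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}]
min_count = 2

items = sorted(set().union(*transactions))
support = {}
for k in range(1, len(items) + 1):
    for cand in combinations(items, k):
        s = frozenset(cand)
        count = sum(s <= t for t in transactions)
        if count >= min_count:
            support[s] = count

# Closed frequent itemset: no proper super-itemset has the same support.
closed = {s for s in support
          if not any(s < t and support[t] == support[s] for t in support)}
# Maximal frequent itemset: no proper super-itemset is frequent at all.
maximal = {s for s in support if not any(s < t for t in support)}

print(sorted(map(sorted, closed)))   # [['a', 'b'], ['a', 'b', 'c']]
print(sorted(map(sorted, maximal)))  # [['a', 'b', 'c']]
```

Note that every maximal frequent itemset is closed, but not vice versa: {a, b} is closed (its only super-itemset {a, b, c} has lower support) yet not maximal.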
53. Why is correlation analysis important?
a. To make apriori memory efficient
b. To weed out uninteresting frequent itemsets
c. To find large number of interesting itemsets
d. To restrict the number of database iterations
54. For questions given below consider the data
Transactions :
1. I1, I2, I3, I4, I5, I6
2. I7, I2, I3, I4, I5, I6
3. I1, I8, I4, I5
4. I1, I9, I10, I4, I6
5. I10, I2, I4, I11, I5
With support as 0.6 find all frequent itemsets?
a. <I1>, <I2>, <I4>, <I5>, <I6>, <I1, I4>, <I2, I4>, <I2, I5>, <I4, I5>, <I4, I6>, <I2, I4,
I5>
b. <I2>, <I4>, <I5>, <I2, I4>, <I2, I5>, <I4, I5>, <I2, I4, I5>
c. <I11>, <I4>, <I5>, <I6>, <I1, I4>, <I5, I4>, <I11, I5>, <I4, I6>, <I2, I4, I5>
d. None of above
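The frequent itemsets in question 54 can be verified by brute-force counting (a sketch, not an efficient miner; it uses the five transactions given above):

```python
from itertools import combinations

transactions = [
    {"I1", "I2", "I3", "I4", "I5", "I6"},
    {"I7", "I2", "I3", "I4", "I5", "I6"},
    {"I1", "I8", "I4", "I5"},
    {"I1", "I9", "I10", "I4", "I6"},
    {"I10", "I2", "I4", "I11", "I5"},
]
min_count = 0.6 * len(transactions)  # relative support 0.6 -> count of 3

items = sorted(set().union(*transactions))
frequent = []
for k in range(1, len(items) + 1):
    level = [set(c) for c in combinations(items, k)
             if sum(set(c) <= t for t in transactions) >= min_count]
    if not level:
        break  # by the Apriori property, no larger itemset can be frequent
    frequent.extend(level)

# 11 frequent itemsets: I1, I2, I4, I5, I6, their five frequent pairs,
# and the triple {I2, I4, I5} -- matching option (a).
print(len(frequent))  # 11
```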
55. What will happen if support is reduced?
a. Number of frequent itemsets remains same
b. Some itemsets will add to the current set of frequent itemsets
c. Some itemsets will become infrequent while others will become frequent
d. Can not say
56. Find all strong association rules given the support is 0.6 and confidence is 0.8.
a. <I2, I4> → I5, <I2, I5> → I4, <I5, I4> → I2
b. <I2, I4> → I5, <I2, I5> → I4
c. Null rule set
d. Cannot be determined
57. What is the effect of reducing min confidence criteria on the same?
a. Number of association rules remains same
b. Some association rules will add to the current set of association rules
c. Some association rules will become invalid while others might become a rule.
d. Can not say
58. Can FP growth algorithm be used if FP tree cannot be fit in memory?

a. Yes
b. No
c. Both a and b
d. None of the above
59. What is association rule mining?
a. Same as frequent itemset mining
b. Finding of strong association rules using frequent itemsets
c. Using association to analyse correlation rules
d. None of the above
60. What is frequent pattern growth?
a. Same as frequent itemset mining
b. Use of hashing to make discovery of frequent itemsets more efficient
c. Mining of frequent itemsets without candidate generation
d. None of the above
61. When is sub-itemset pruning done?
a. A frequent itemset 'P' is a proper subset of another frequent itemset 'Q'
b. Support (P) = Support(Q)
c. When both a and b is true
d. When a is true and b is not
62. Which of the following is not a null-invariant measure (i.e., a measure whose value is
unaffected by null transactions)?
a. all_confidence
b. max_confidence
c. cosine measure
d. lift
63. The Apriori algorithm works in a ________ and ________ fashion?
a. top-down and depth-first
b. top-down and breadth-first
c. bottom-up and depth-first
d. bottom-up and breadth-first
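The level-wise behavior asked about in question 63 can be sketched as a minimal Apriori loop (the prune step is omitted and the transactions are assumed toy data):

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Level-wise Apriori sketch: frequent k-itemsets are joined to form
    candidate (k + 1)-itemsets, one breadth-first level at a time."""
    items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in items]
    frequent = []
    k = 1
    while level:
        freq_k = [s for s in level
                  if sum(s <= t for t in transactions) >= min_count]
        frequent.extend(freq_k)
        # Join step: unions of frequent k-itemsets that have size k + 1.
        level = list({a | b for a, b in combinations(freq_k, 2)
                      if len(a | b) == k + 1})
        k += 1
    return frequent

txns = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}]
print(sorted(map(sorted, apriori(txns, 2))))
# [['a'], ['a', 'b'], ['a', 'c'], ['b'], ['c']]
```

Each pass over the candidate `level` is one database scan, growing itemsets from singletons upward, which is why the algorithm is described as bottom-up and breadth-first.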
