This document is a mid-term question bank for the Data Warehousing and Data Mining course at V.V.P. Engineering College, covering five units. Topics include data warehousing concepts, data mining processes, data preprocessing techniques, frequent patterns and associations, and classification methods. Each unit contains specific questions designed to test students' understanding of the subject matter.
DWDM_Mid-1
Semester 6th – I.T. Department – V.V.P. Engg. College
DATA WAREHOUSING AND DATA MINING
Subject Code: 3161610
Mid-1 Unit-Wise Question Bank
Unit-1 Data Warehousing
1) What is Data Warehousing? Explain its features.
2) Differentiate between: a) Data warehouse and Data Mart b) OLTP and OLAP systems c) Fact table and Dimension table.
3) With the help of a neat diagram, explain the 3-tier architecture of a data warehouse.
4) Explain the Star, Snowflake, and Fact Constellation schemas for a multidimensional database, with diagrams.
5) What is a Cube? Explain the various OLAP operations on a Data Cube with an example.
Unit-2 Introduction to data mining (DM)
6) Define the term “Data Mining”. With the help of a suitable diagram, explain the process of knowledge discovery from databases. Why is it called data mining rather than knowledge mining?
7) List the types of data on which data mining can be performed. Explain different data mining functionalities.
8) Write a note on Classification of data mining.
9) Discuss possible ways for integration of a Data Mining system with a Database or Data Warehouse system.
10) List and describe major issues in data mining.
Unit-3 Data Preprocessing
11) Explain the pre-processing required to handle missing data and noisy data during the process of data mining. Or List and describe the methods for handling missing and noisy values in data cleaning.
12) Suppose that the data for analysis includes the attribute age. The age values for the data tuples are (in increasing order): 13, 15, 16, 16, 19, 20, 23, 29, 35, 41, 44, 53, 62, 69, 72
a. Use min-max normalization to transform the value 45 for age onto the range [0.0, 1.0].
b. Use z-score normalization to transform the value 45 for age, where the standard deviation of age is 20.64 years.
13) What is noise? Explain data smoothing by binning as a noise removal technique. Partition the given data into equal-frequency bins of size 3, then smooth by bin means, by bin medians, and by bin boundaries. Consider the data: 10, 2, 19, 18, 20, 18, 25, 28, 22
Compiled By: Darshana H. Patel V.V.P. Engineering College, Rajkot
Unit-4 Mining Frequent Patterns, Associations and Correlations
14) Write and discuss the algorithm which is used to generate frequent itemsets using an iterative level-wise approach based on candidate generation. State the Apriori Property. Also, list the techniques to improve the efficiency of the Apriori algorithm. Generate large itemsets and association rules using the Apriori algorithm on the following data set, with a minimum support value and a minimum confidence value set as 50% and 75% respectively.
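For the numerical questions in Unit-3 (12 and 13), the following is a minimal Python sketch of how the computations could be checked, not a model answer. The function names are illustrative and do not come from the course material. The standard deviation of 20.64 is taken as given in question 12. For smoothing by bin boundaries, a value equidistant from both boundaries is assigned to the lower one; that tie rule is an assumption, and a textbook or instructor may use a different convention.

```python
from statistics import mean, median

ages = [13, 15, 16, 16, 19, 20, 23, 29, 35, 41, 44, 53, 62, 69, 72]
noisy = [10, 2, 19, 18, 20, 18, 25, 28, 22]


def min_max(v, values, new_min=0.0, new_max=1.0):
    """Min-max normalization of v onto [new_min, new_max]."""
    lo, hi = min(values), max(values)
    return (v - lo) / (hi - lo) * (new_max - new_min) + new_min


def z_score(v, values, std):
    """Z-score normalization using a supplied standard deviation."""
    return (v - mean(values)) / std


def equal_frequency_bins(values, size):
    """Sort the data, then cut it into consecutive bins of `size` values."""
    ordered = sorted(values)
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def smooth_by_means(bins):
    return [[mean(b)] * len(b) for b in bins]


def smooth_by_medians(bins):
    return [[median(b)] * len(b) for b in bins]


def smooth_by_boundaries(bins):
    """Replace each value by the nearer bin boundary.

    Assumption: ties go to the lower boundary.
    """
    smoothed = []
    for b in bins:
        lo, hi = b[0], b[-1]
        smoothed.append([lo if v - lo <= hi - v else hi for v in b])
    return smoothed


if __name__ == "__main__":
    print(round(min_max(45, ages), 4))        # 0.5424
    print(round(z_score(45, ages, 20.64), 3))  # 0.478
    bins = equal_frequency_bins(noisy, 3)
    print(bins)  # [[2, 10, 18], [18, 19, 20], [22, 25, 28]]
    print(smooth_by_means(bins))
    print(smooth_by_medians(bins))
    print(smooth_by_boundaries(bins))
```

In this particular data set the middle value of every bin is equidistant from the two boundaries, so the result of smoothing by bin boundaries depends entirely on the tie rule chosen.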
Unit-5 Classification and Prediction
15) Explain the Classification by Decision Tree Induction algorithm, illustrating it with an example.
16) What is classification and prediction? List the issues regarding classification and prediction.
17) Discuss Tree Pruning in detail. Or Why is tree pruning useful in decision tree induction?
18) What is an attribute selection measure? Explain different attribute selection measures with an example. Or Explain the following as attribute selection measures: (i) Information Gain (ii) Gain Ratio
19) Explain “Linear Regression” using a suitable example. Or Explain Linear and Non-Linear Regression methods of prediction. Or What is linear regression? What are the reasons for not using the linear regression model to estimate the output data?
20) Why is naïve Bayesian classification called “naïve”? Briefly outline the major ideas of naïve Bayesian classification, giving an example. Or Explain Bayes’ Theorem and a statistical-based algorithm used for classification.
21) a) Explain how the accuracy of a classifier/predictor can be measured (or evaluated). b) Describe the methods by which accuracy can be increased (ensemble methods/combining methods).
22) Write a note on accuracy and error measures for classification and prediction.
23) Explain rule-based classification and case-based reasoning in detail.
24) What are neural networks? Describe the various factors which make them useful for classification and prediction in data mining. Explain how the topology of the neural network is designed. List the strengths and weaknesses of a neural network as a classifier. What are the terminating conditions to stop the training process of the neural network classifier?