DWDM_Mid-1

This document is a mid-term question bank for the Data Warehousing and Data Mining course at V.V.P. Engineering College, covering five units. Topics include data warehousing concepts, data mining processes, data preprocessing techniques, frequent patterns and associations, and classification methods. Each unit contains specific questions designed to test students' understanding of the subject matter.

Semester 6th – I.T. Department – V.V.P. Engg. College
DATA WAREHOUSING AND DATA MINING
Subject Code: 3161610
Mid-1 Unit-Wise Question Bank

Unit-1 Data Warehousing


1) What is Data Warehousing? Explain its features.
2) Differentiate between: a) Data warehouse and Data Mart, b) OLTP and OLAP systems,
c) Fact table and Dimension table.
3) With the help of a neat diagram, explain the 3-tier architecture of a data warehouse.
4) Explain the Star, Snowflake, and Fact Constellation schemas for a multidimensional
database, with diagrams.
5) What is a data cube? Explain the various OLAP operations on a data cube with examples.

Unit-2 Introduction to data mining (DM)


6) Define the term “Data Mining”. With the help of a suitable diagram, explain the
process of knowledge discovery from databases. Why is it called data mining rather
than knowledge mining?
7) List the types of data on which data mining can be performed. Explain different data
mining functionalities.
8) Write a note on the classification of data mining systems.
9) Discuss possible ways for integration of a Data Mining system with a Database or
Data Warehouse system.
10) List and describe major issues in data mining.

Unit-3 Data Preprocessing


11) Explain the pre-processing required to handle missing data and noisy data during
the process of data mining. Or List and describe the methods for handling the
missing and noisy values in data cleaning.
12) Suppose that the data for analysis includes the attribute age. The age values for the
data tuples are (in increasing order): 13, 15, 16, 16, 19, 20, 23, 29, 35, 41, 44, 53, 62,
69, 72
a. Use min-max normalization to transform the value 45 for age onto the range
[0.0, 1.0].
b. Use z-score normalization to transform the value 45 for age, where the
standard deviation of age is 20.64 years.
13) What is noise? Explain binning as a data smoothing (noise removal) technique.
Partition the given data into equal-frequency bins of size 3, then smooth by bin
means, by bin medians, and by bin boundaries. Consider the data: 10, 2, 19, 18, 20,
18, 25, 28, 22
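The arithmetic in questions 12 and 13 can be checked with a short Python sketch. It is only an illustration, not a model answer: the function names are made up for this example, and the tie-breaking rule in smoothing by bin boundaries (a value equidistant from both boundaries goes to the lower one) is an assumption, since textbooks differ on that point.

```python
from statistics import mean, median, stdev

ages = [13, 15, 16, 16, 19, 20, 23, 29, 35, 41, 44, 53, 62, 69, 72]
noisy = [10, 2, 19, 18, 20, 18, 25, 28, 22]

def min_max(v, data, new_min=0.0, new_max=1.0):
    """Min-max normalization of v onto [new_min, new_max]."""
    lo, hi = min(data), max(data)
    return (v - lo) / (hi - lo) * (new_max - new_min) + new_min

def z_score(v, data, sd=None):
    """Z-score normalization; uses the sample standard deviation unless sd is given."""
    sd = stdev(data) if sd is None else sd
    return (v - mean(data)) / sd

def equal_frequency_bins(data, size):
    """Sort the data and cut it into consecutive bins of `size` values."""
    s = sorted(data)
    return [s[i:i + size] for i in range(0, len(s), size)]

def smooth_by_means(bins):
    return [[mean(b)] * len(b) for b in bins]

def smooth_by_medians(bins):
    return [[median(b)] * len(b) for b in bins]

def smooth_by_boundaries(bins):
    """Replace each value by the nearer bin boundary (assumed: ties go to the lower one)."""
    out = []
    for b in bins:
        lo, hi = b[0], b[-1]
        out.append([lo if v - lo <= hi - v else hi for v in b])
    return out

bins = equal_frequency_bins(noisy, 3)
print(round(min_max(45, ages), 4))            # (45 - 13) / (72 - 13)
print(round(z_score(45, ages, sd=20.64), 4))  # (45 - mean) / 20.64
print(bins)
print(smooth_by_means(bins))
print(smooth_by_medians(bins))
print(smooth_by_boundaries(bins))
```

The standard deviation of 20.64 quoted in question 12(b) matches the sample standard deviation (divisor n - 1) of the listed ages, which is what `statistics.stdev` computes.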
Compiled By: Darshana H. Patel
V.V.P. Engineering College, Rajkot
Unit-4 Mining Frequent Patterns, Associations and Correlations
14) Write and discuss the algorithm which is used to generate frequent itemsets using
an iterative level-wise approach based on candidate generation. State the Apriori
property. Also, list the techniques to improve the efficiency of the Apriori algorithm.
Generate large itemsets and association rules using the Apriori algorithm on the
following data set with a minimum support value and a minimum confidence
value set as 50% and 75% respectively.
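The level-wise procedure in question 14 can be sketched in Python as follows. The data set for the question is not reproduced in this copy, so the four transactions below are purely hypothetical and chosen only to exercise the code; the function names are likewise illustrative. The sketch follows the usual join and prune steps and is not tuned for efficiency.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frequent itemset: support} using level-wise candidate generation."""
    tx = [frozenset(t) for t in transactions]
    n = len(tx)

    def support(itemset):
        return sum(itemset <= t for t in tx) / n

    # Level 1: frequent single items.
    level = {frozenset([i]) for t in tx for i in t}
    level = {c for c in level if support(c) >= min_support}
    frequent = {}
    k = 1
    while level:
        for c in level:
            frequent[c] = support(c)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step (Apriori property): every (k-1)-subset must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level for s in combinations(c, k - 1))}
        level = {c for c in candidates if support(c) >= min_support}
    return frequent

def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for rules meeting min_confidence."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                conf = sup / frequent[lhs]  # subsets of a frequent itemset are frequent
                if conf >= min_confidence:
                    rules.append((set(lhs), set(itemset - lhs), conf))
    return rules

# Hypothetical transactions, NOT the data set from the question paper.
transactions = [{"A", "B", "C"}, {"A", "C"}, {"A", "D"}, {"B", "C"}]
frequent = apriori(transactions, min_support=0.5)
rules = association_rules(frequent, min_confidence=0.75)
print(frequent)
print(rules)
```

On these hypothetical transactions the candidate {A, B, C} is discarded in the prune step because its subset {A, B} is not frequent, which is the Apriori property at work.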

Unit-5 Classification and Prediction


15) Explain the Classification by Decision Tree Induction algorithm, illustrating the
algorithm with an example.
16) What is classification and prediction? List the issues regarding classification and
prediction.
17) Discuss Tree Pruning in detail. Or why is tree pruning useful in decision tree
induction?
18) What is an attribute selection measure? Explain different attribute selection measures
with an example. OR Explain the following as attribute selection measure: (i)
Information Gain (ii) Gain Ratio
19) Explain “Linear Regression” using a suitable example. Or Explain linear and
non-linear regression methods of prediction. Or What is linear regression? What are the
reasons for not using the linear regression model to estimate the output data?
20) Why is naïve Bayesian classification called “naïve”? Briefly outline the major ideas of
naïve Bayesian classification, giving an example. Or Explain Bayes’ Theorem and a
statistical-based algorithm used for classification.
21) a) Explain how the accuracy of a classifier/predictor can be measured (or evaluated).
b) Describe the methods by which accuracy can be increased (ensemble
methods/combining methods).
22) Write a note on accuracy and error measures for classification and prediction.
23) Explain rule-based classification and case-based reasoning in detail.
24) What are neural networks? Describe the various factors which make them useful for
classification and prediction in data mining. Explain how the topology of the neural
network is designed. List the strengths and weaknesses of a neural network as a
classifier. What are the terminating conditions to stop the training process of the
neural network classifier?
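For question 18, the two named attribute selection measures can be illustrated with a small Python sketch. The four-row data set and its attribute names are hypothetical and exist only to make the example runnable; the formulas are the standard entropy-based definitions of information gain and gain ratio.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def information_gain(rows, labels, attr):
    """Entropy of the labels minus the weighted entropy after splitting on attr."""
    parts = {}
    for row, y in zip(rows, labels):
        parts.setdefault(row[attr], []).append(y)
    n = len(labels)
    remainder = sum(len(p) / n * entropy(p) for p in parts.values())
    return entropy(labels) - remainder

def gain_ratio(rows, labels, attr):
    """Information gain divided by the split information of attr."""
    split_info = entropy([row[attr] for row in rows])
    return information_gain(rows, labels, attr) / split_info if split_info else 0.0

# Hypothetical training tuples, for illustration only.
rows = [
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "rain", "windy": "yes"},
    {"outlook": "rain", "windy": "no"},
]
labels = ["no", "no", "yes", "yes"]

for attr in ("outlook", "windy"):
    print(attr, information_gain(rows, labels, attr), gain_ratio(rows, labels, attr))
```

In this toy data "outlook" separates the classes perfectly and "windy" carries no information, so a decision tree built with either measure would split on "outlook" first.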