Data Warehousing and Data Mining Question Papers
R 13
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY, HYDERABAD
B. Tech IV Year I Semester
Data Warehousing and Data Mining
(Computer Science and Engineering)
Time: 3 hours Max Marks: 75
Note: This question paper contains two parts, A and B. Part A is compulsory and carries 25 marks.
Answer all questions in Part A. Part B consists of 5 Units. Answer any one full question from each unit.
Each question carries 10 marks and may have a, b, c as sub-questions.
MODEL PAPER - 1
PART-A (Answer all the Questions)
1. a) Write the differences between data warehousing and data mining. (3M)
b) Define multidimensional data mining. (2M)
c) State the various views of data warehouse design. (3M)
d) Name the steps involved in data mining. (2M)
e) Name the pruning strategies in mining closed frequent item sets. (3M)
f) List the applications of pattern mining. (2M)
g) Differentiate between supervised and unsupervised learning. (2M)
h) Write short notes on the backpropagation algorithm. (3M)
i) State the applications of clustering. (3M)
j) Explain briefly the grid-based method. (2M)
PART-B
2. Write the differences between operational databases and data warehousing. (10M)
OR
3. Explain in detail the evolution of database technology. (10M)
4. Discuss briefly multidimensional data models. (10M)
OR
5. State and explain the methods used for efficient data cube computation. (6M)
6. Discuss the FP-Growth algorithm with an example. (10M)
OR
7. Explain how to mine multidimensional association rules from relational databases and data warehouses.
8. Discuss in detail the decision tree induction algorithm. (10M)
OR
9. Write in detail about the k-nearest neighbor classifier and case-based reasoning.
10. Define and explain the two hierarchical clustering methods: BIRCH and CHAMELEON. (10M)
OR
11. Explain:
a) Statistical-based outlier detection. (5M)
b) Distance-based outlier detection. (5M)
MODEL PAPER - 2
PART-A (Answer all the Questions)
1. a) Write short notes on the issues in data mining. (3M)
b) Define characterization and discrimination. (2M)
c) Explain in short the virtual data warehouse. (3M)
d) Define concept hierarchy. Explain the types of concept hierarchies. (2M)
e) Define closed item set and maximal frequent item set. (3M)
f) What is an association? Write short notes on association rule mining. (2M)
g) Define regression analysis. (2M)
h) Write short notes on the attribute selection measures. (3M)
i) Write neatly about the data types used in cluster analysis. (3M)
j) Write about density-based clustering. (2M)
PART-B
2. Define data mining and explain in detail the data warehouse architecture with a neat diagram. (10M)
OR
3. What are the primitives that specify a data mining task? Explain in detail the data smoothing techniques. (10M)
4. Write neatly about the different schemas used in multidimensional data mining, with an example for each. (10M)
OR
5. Define ROLAP, MOLAP, and HOLAP. Explain in detail the efficient methods of data cube computation. (10M)
8. What measures are used to find the best split in the decision tree induction algorithm? How can we improve the scalability of the decision tree induction algorithm? (10M)
OR
9. Describe the working procedure of the simple Bayesian classifier. Discuss the backpropagation algorithm.
MODEL PAPER - 3
PART-A (Answer all the Questions)
1. a) What are the advantages of a data warehouse? (3M)
b) Define OLAP. (2M)
c) List the major issues in data mining. (3M)
d) List the reasons for using data mining. (2M)
e) List the reasons for using data mining. (3M)
f) Define FP-tree. (2M)
g) What is Hunt’s algorithm? (2M)
h) What is the holdout technique? (3M)
i) List the requirements of clustering. (3M)
j) Write in brief about index-based algorithms. (2M)
PART-B
2. Explain the steps for designing and constructing a data warehouse. (10M)
OR
3. What is data mining? List and describe the motivating challenges of data mining. (10M)
MODEL PAPER - 4
PART-A (Answer all the Questions)
1. a) Compare OLAP systems with statistical databases. (3M)
b) What are the differences between fact and dimension tables? (2M)
c) Write a brief note on data discretization. (3M)
d) Discuss briefly the similarities between data objects. (2M)
e) Describe the brute-force method for generating candidate item sets. (3M)
f) Discuss briefly the monotonicity property. (2M)
g) List the advantages of information gain. (2M)
h) List the weaknesses of k-means. (3M)
i) Write a short note on density-based outlier detection. (3M)
j) Write short notes on density-based outlier detection. (2M)
PART-B
2. Draw and explain the three-tier data warehouse architecture. (10M)
OR
3. Illustrate and explain the OLAP architecture. (10M)
MODEL PAPER - 5
PART-A (Answer all the Questions)
1. a) Discuss the characteristics of a fact table. (3M)
b) What are the tools used for designing a data warehouse? (2M)
c) What is data cleaning? (3M)
d) Write the algorithm for the discrete wavelet transform (DWT). (2M)
e) Define support and confidence. (3M)
f) What are the drawbacks of the FP-Growth algorithm? (2M)
g) What is meant by classification? What are the applications of classification models? (2M)
h) List the advantages of Bayesian classification. (3M)
i) Compare complete and partial clusters. (3M)
j) Discuss the time and space complexity of k-means. (2M)
PART-B
6. With an example, explain frequent item set generation in the Apriori algorithm. (10M)
OR
7. Explain the partition algorithm with an example. (10M)
MODEL PAPER - 6
PART-A (Answer all the Questions)
OR
5. Explain data mining task primitives.
8. Write a prototype for a data mining application for an insurance data warehouse (DWH). (10M)
OR
9. Write about mining web link structures to identify authoritative web pages.
10. Explain spatial data cube construction and spatial OLAP. (10M)
OR
11. Explain the Class Composition Hierarchies.
Code No: R15A0526 R15
MALLA REDDY COLLEGE OF ENGINEERING & TECHNOLOGY
(Autonomous Institution – UGC, Govt. of India)
IV B. Tech I Semester Supplementary Examinations, May 2019
Data Warehousing and Data Mining
(CSE)
SECTION-I
OR
OR
SECTION-III
OR
7. Differentiate between a maximal frequent item set and a closed frequent item set. [10M]
SECTION-IV
OR
SECTION-V
OR
******