Data Warehousing and Mining

This document outlines a course on Data Warehousing and Mining. The 4-credit course aims to help students understand data warehouse fundamentals, design dimensional data models, apply OLAP operations, and use data mining techniques such as classification, clustering and association rule mining. The six-module, 52-hour course covers ETL processes, dimensional modelling, data exploration and preprocessing, classification, prediction, clustering, frequent pattern mining, and spatial and web mining. Students are assessed through two in-semester class tests of 20 marks each and an end-semester theory examination of six 20-mark questions, of which four must be solved.

Uploaded by

coastudies

Course Code | Course Name | Credits
CSC603 | Data Warehousing and Mining | 4

Course objectives:
1. To identify the scope and necessity of Data Warehousing and Mining.
2. To analyze data and choose relevant models and algorithms for the respective applications.
3. To study spatial and web data mining.
4. To develop research interest towards advances in data mining.

Course outcomes: On successful completion of the course, the learner will be able to:

1. Understand Data Warehouse fundamentals and Data Mining principles.
2. Design a data warehouse with dimensional modelling and apply OLAP operations.
3. Identify appropriate data mining algorithms to solve real-world problems.
4. Compare and evaluate different data mining techniques such as classification, prediction, clustering and association rule mining.
5. Describe complex data types with respect to spatial and web mining.
6. Benefit the user experiences towards research and innovation.

Prerequisite: Basic database concepts, Concepts of algorithm design and analysis.

Module No. | Topics | Hrs.
1 | Introduction to Data Warehouse and Dimensional Modelling: Introduction to Strategic Information, Need for Strategic Information, Features of Data Warehouse, Data Warehouses versus Data Marts, Top-down versus Bottom-up approach, Data warehouse architecture, Metadata, E-R Modelling versus Dimensional Modelling, Information Package Diagram, STAR schema, STAR schema keys, Snowflake Schema, Fact Constellation Schema, Factless Fact tables, Update to the dimension tables, Aggregate fact tables. | 8
2 | ETL Process and OLAP: Major steps in ETL process, Data extraction: Techniques, Data transformation: Basic tasks, Major transformation types, Data Loading: Applying Data, OLTP vs. OLAP, OLAP definition, Dimensional Analysis, Hypercubes, OLAP operations: Drill down, Roll up, Slice, Dice and Rotation, OLAP models: MOLAP, ROLAP. | 8
3 | Introduction to Data Mining, Data Exploration and Preprocessing: Data Mining Task Primitives, Architecture, Techniques, KDD process, Issues in Data Mining, Applications of Data Mining, Data Exploration: Types of Attributes, Statistical Description of Data, Data Visualization, Data Preprocessing: Cleaning, Integration, Reduction: Attribute subset selection, Histograms, Clustering and Sampling, Data Transformation & Data Discretization: Normalization, Binning, Concept hierarchy generation, Concept Description: Attribute-oriented Induction for Data Characterization. | 10
4 | Classification, Prediction and Clustering: Basic Concepts, Decision Tree Induction using Information Gain, Attribute Selection Measures, Tree Pruning, Bayesian Classification: Naive Bayes Classifier, Rule-Based Classification: Using IF-THEN Rules for classification, Prediction: Simple linear regression, Multiple linear regression, Model Evaluation & Selection: Accuracy and Error measures, Holdout, Random Sampling, Cross Validation, Bootstrap, Clustering: Distance Measures, Partitioning Methods (k-Means, k-Medoids), Hierarchical Methods (Agglomerative, Divisive). | 12
5 | Mining Frequent Patterns and Association Rules: Market Basket Analysis, Frequent Itemsets, Closed Itemsets, and Association Rules, Frequent Pattern Mining, Efficient and Scalable Frequent Itemset Mining Methods: Apriori Algorithm, Association Rule Generation, Improving the Efficiency of Apriori, FP-growth, Mining Frequent Itemsets using Vertical Data Format, Introduction to Mining Multilevel Association Rules and Multidimensional Association Rules. | 8
6 | Spatial and Web Mining: Spatial Data, Spatial vs. Classical Data Mining, Spatial Data Structures, Mining Spatial Association and Co-location Patterns, Spatial Clustering Techniques: CLARANS Extension, Web Mining: Web Content Mining, Web Structure Mining, Web Usage Mining, Applications of Web Mining. | 6
Total | | 52
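
As an illustration of the Apriori Algorithm listed under Module 5, the following is a minimal Python sketch, not part of the syllabus itself. The function name `apriori`, the sample market-basket transactions and the threshold `min_support=3` are all assumptions made for this example. It shows the level-wise idea: candidate k-itemsets are built by joining frequent (k-1)-itemsets, pruned if any (k-1)-subset is infrequent, and then counted against the transactions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset (frozenset): support count} for all frequent itemsets.

    min_support is an absolute count (hypothetical choice for this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count every single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: n for s, n in counts.items() if n >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: support is the number of transactions containing the candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: n for s, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
    return result


# Made-up market-basket data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent_itemsets = apriori(transactions, min_support=3)
for itemset, support in sorted(frequent_itemsets.items(),
                               key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

With this sample data, four single items and four pairs meet the threshold. The triple {bread, milk, diapers} survives the prune step but occurs in only two transactions, so it is not frequent.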

Text Books:
1. Paulraj Ponniah, "Data Warehousing: Fundamentals for IT Professionals", Wiley India.
2. Han, Kamber, "Data Mining Concepts and Techniques", Morgan Kaufmann, 3rd edition.
3. Reema Thareja, "Data Warehousing", Oxford University Press.
4. M.H. Dunham, "Data Mining Introductory and Advanced Topics", Pearson Education.

Reference Books:
1. Ian H. Witten, Eibe Frank and Mark A. Hall, "Data Mining", 3rd Edition, Morgan Kaufmann.
2. Pang-Ning Tan, Michael Steinbach and Vipin Kumar, "Introduction to Data Mining", Pearson.
3. R. Chattamvelli, "Data Mining Methods", 2nd Edition, Narosa Publishing House.

Internal Assessment:
Assessment consists of two class tests of 20 marks each. The first class test is to be conducted when approximately 40% of the syllabus is completed, and the second class test when an additional 40% of the syllabus is completed. The duration of each test shall be one hour.

End Semester Theory Examination:

1. The question paper will comprise 6 questions, each carrying 20 marks.
2. Students need to solve a total of 4 questions.
3. Question No. 1 will be compulsory and based on the entire syllabus.
4. The remaining questions (Q.2 to Q.6) will be selected from all the modules.

University of Mumbai, B. E. (Computer Engineering), Rev. 2016