Part A
Aim:
Prerequisite: Database
Outcome: To impart knowledge of data warehouse and data mining
Theory:
The term data mining refers loosely to the process of semi-automatically analyzing
large databases to find useful patterns.
Data mining attempts to discover rules and patterns from large amounts of data.
Put simply, data mining refers to extracting or "mining" knowledge from large amounts
of data.
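As an illustration of discovering a pattern from data, the following is a minimal sketch that finds item pairs bought together frequently. The transactions, the function name frequent_pairs, and the min_support threshold are all hypothetical and chosen only for this example; real data mining tools use far more scalable algorithms.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs whose support (fraction of transactions
    containing both items) is at least min_support."""
    counts = Counter()
    for items in transactions:
        # Count each unordered pair at most once per transaction.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    total = len(transactions)
    return {pair: n / total for pair, n in counts.items() if n / total >= min_support}


# Hypothetical market-basket data (assumption, not from the source).
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "jam"],
    ["bread", "milk", "jam"],
]

patterns = frequent_pairs(transactions, min_support=0.5)
print(patterns)
```

Here only the pair (bread, milk) occurs in at least half of the transactions, so it is the only pattern reported.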
Many other terms are also used in addition to data mining, such as knowledge mining from
data, knowledge extraction, and data/pattern analysis.
Data mining is one step of the Knowledge Discovery in Databases (KDD) process, which consists of:
Data cleaning
Data integration
Data selection
Data transformation
Data mining
Pattern evaluation
Knowledge representation
Data mining is sometimes itself referred to as KDD, and the terms DM and KDD tend to be
used as synonyms.
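The seven KDD steps listed above can be sketched as a chain of small functions. This is only an illustrative toy under stated assumptions: the two source lists (store_a, store_b), the record fields, and the min_qty threshold are hypothetical, and the "mining" step is a trivial ranking rather than a real mining algorithm.

```python
from collections import Counter

# Hypothetical source data: sales records from two separate stores.
store_a = [{"customer": "c1", "item": "Milk ", "qty": 2},
           {"customer": "c2", "item": None, "qty": 1}]
store_b = [{"customer": "c3", "item": "milk", "qty": 1},
           {"customer": "c4", "item": "Bread", "qty": 4}]


def clean(records):
    # Data cleaning: drop records with a missing item, normalise item names.
    return [{**r, "item": r["item"].strip().lower()} for r in records if r["item"]]


def integrate(*sources):
    # Data integration: combine records from several sources.
    return [r for source in sources for r in source]


def select(records):
    # Data selection: keep only the attributes relevant to the analysis.
    return [(r["item"], r["qty"]) for r in records]


def transform(rows):
    # Data transformation: aggregate quantities per item.
    totals = Counter()
    for item, qty in rows:
        totals[item] += qty
    return totals


def mine(totals):
    # Data mining (toy stand-in): rank items by total quantity sold.
    return totals.most_common()


def evaluate(patterns, min_qty):
    # Pattern evaluation: keep only patterns considered interesting.
    return [(item, qty) for item, qty in patterns if qty >= min_qty]


def represent(patterns):
    # Knowledge representation: present results in readable form.
    return [f"{item}: {qty} units sold" for item, qty in patterns]


integrated = integrate(clean(store_a), clean(store_b))
report = represent(evaluate(mine(transform(select(integrated))), min_qty=4))
print(report)
```

Each function corresponds to one step of the list, so the order of the calls mirrors the order of the KDD process.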
Part B
Observation & Learning: In this experiment, we focused on the architecture of data warehousing and
the techniques used for data mining.
Conclusion: Over the next few years, the growth of data warehousing is going to be enormous, with new
products and technologies coming out. It is going to be important that data warehouse planners and
developers have a clear idea of what they are looking for and then choose strategies and methods that will
provide them with performance.
Questions:
1. List the different tools used in data warehousing.
Redshift
Microsoft SQL Server
PostgreSQL
MySQL
Microsoft Azure
Oracle
Skyvia
Xplenty
Alooma
Atom
The insights derived via data mining can be used for marketing, fraud detection,
scientific discovery, and so on.
Relational databases
Data warehouses
Advanced DB and information repositories
Object-oriented and object-relational databases
Transactional and Spatial databases
Heterogeneous and legacy databases
Multimedia and streaming databases
Text databases
Text mining and Web mining
The staging area is mainly used to quickly extract data from its data sources, minimizing the
impact on the sources.
After data has been loaded into the staging area, the staging area is used to combine data
from multiple data sources and to apply transformations, validations, and data cleansing. Data is
often transformed into a star schema prior to loading into the data warehouse.
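The role of the staging area described above can be sketched with an in-memory SQLite database. Everything named here is an assumption made for illustration: the source extracts (crm_rows, web_rows), the table names (stg_sales, dim_customer, fact_sales), and the validation rule. The star schema is reduced to one fact table and one dimension table to keep the example short.

```python
import sqlite3

# Hypothetical raw extracts from two source systems (all values as text).
crm_rows = [("1", " Alice ", "2024-01-05", "120.50"),
            ("2", "Bob", "2024-01-06", "n/a")]
web_rows = [("3", "alice", "2024-01-07", "30")]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stg_sales (source TEXT, order_id TEXT, customer TEXT,
                        order_date TEXT, amount TEXT);
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE fact_sales (order_id INTEGER PRIMARY KEY,
                         customer_key INTEGER REFERENCES dim_customer(customer_key),
                         order_date TEXT, amount REAL);
""")

# Extract: copy rows into staging unchanged, so the sources are touched only once.
conn.executemany("INSERT INTO stg_sales VALUES ('crm', ?, ?, ?, ?)", crm_rows)
conn.executemany("INSERT INTO stg_sales VALUES ('web', ?, ?, ?, ?)", web_rows)


def is_number(text):
    """Validation helper: True if text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


# Combine, validate, cleanse, and load into the star schema.
staged = conn.execute("SELECT * FROM stg_sales").fetchall()
for source, order_id, customer, order_date, amount in staged:
    if not is_number(amount):
        continue  # validation: skip rows with an unusable amount
    name = customer.strip().title()  # cleansing: unify customer names
    conn.execute("INSERT OR IGNORE INTO dim_customer (name) VALUES (?)", (name,))
    (key,) = conn.execute(
        "SELECT customer_key FROM dim_customer WHERE name = ?", (name,)
    ).fetchone()
    conn.execute(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        (int(order_id), key, order_date, float(amount)),
    )

totals = conn.execute("""
    SELECT d.name, SUM(f.amount)
    FROM fact_sales f JOIN dim_customer d USING (customer_key)
    GROUP BY d.name ORDER BY d.name
""").fetchall()
print(totals)
```

Keeping the raw rows in stg_sales as text means the extract step stays fast and simple, while all type conversion and cleansing happens afterwards without going back to the source systems.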