
Part A Aim: Prerequisite: Database Outcome: To Impart Knowledge of Data Warehouse and Data Mining Theory

The document discusses data warehousing and data mining architectures. It provides an overview of data warehousing, describing how data is stored across distributed databases to improve access and processing. It defines data mining as analyzing data to find useful patterns and knowledge. The document also outlines the key steps in the knowledge discovery process including data cleaning, integration, selection, transformation, and pattern evaluation.

Uploaded by

khushi
Copyright
© All Rights Reserved

Part A

Aim: Study of Data warehouse and Data Mining architecture.


Prerequisite: Database
Outcome: To impart knowledge of Data warehouse and Data Mining
Theory:
A data warehouse is a collection of databases that work together. ... Distributed
databases are used to store a database at multiple computer sites to improve data access
and processing. Data mining is the process of analysing data and summarizing it to
produce useful information.
Data warehouse Architecture

• The term data mining refers loosely to the process of semi-automatically analyzing
large databases to find useful patterns.
• Data mining attempts to discover rules and patterns from large amounts of data.
• Put simply, data mining refers to extracting or “mining” knowledge from large amounts
of data.
• Many other terms are also used in place of data mining, such as mining from
data, knowledge extraction, and data/pattern analysis.
• Data mining is one step of the Knowledge Discovery in Databases (KDD) process,
which consists of:
▪ Data cleaning
▪ Data integration
▪ Data selection
▪ Data transformation
▪ Data mining
▪ Pattern evaluation
▪ Knowledge representation
• Data mining is sometimes referred to as KDD, and the terms DM and KDD tend to be
used as synonyms.
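
The seven KDD steps listed above can be sketched as a small pipeline. This is only an illustration, not a standard implementation: the sample records, the function names and the 0.5 support threshold are all assumptions made for the example, and the "mining" step is reduced to counting item pairs that occur together.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw data from two sources; None marks a missing value.
store_sales = [
    {"txn": 1, "items": ["bread", "milk"]},
    {"txn": 2, "items": ["bread", "butter", "milk"]},
    {"txn": 3, "items": None},  # dirty record
]
web_sales = [
    {"txn": 4, "items": ["Milk", "bread "]},
    {"txn": 5, "items": ["butter", "jam"]},
]

def clean(records):
    """Data cleaning: drop records whose item list is missing or empty."""
    return [r for r in records if r["items"]]

def integrate(*sources):
    """Data integration: combine several sources into one list."""
    return [r for source in sources for r in source]

def select(records):
    """Data selection: keep only the attribute relevant to the task."""
    return [r["items"] for r in records]

def transform(baskets):
    """Data transformation: normalise names, store each basket as a sorted tuple."""
    return [tuple(sorted({i.strip().lower() for i in b})) for b in baskets]

def mine(baskets):
    """Data mining: count how often each pair of items occurs together."""
    counts = Counter()
    for b in baskets:
        counts.update(combinations(b, 2))
    return counts

def evaluate(counts, n_baskets, min_support):
    """Pattern evaluation: keep pairs whose support meets the threshold."""
    return {p: c / n_baskets for p, c in counts.items() if c / n_baskets >= min_support}

def present(patterns):
    """Knowledge representation: render the surviving patterns as readable text."""
    return [f"{a} + {b}: support {s:.2f}" for (a, b), s in sorted(patterns.items())]

baskets = transform(select(integrate(clean(store_sales), clean(web_sales))))
patterns = evaluate(mine(baskets), len(baskets), min_support=0.5)
for line in present(patterns):
    print(line)
```

Each function corresponds to one step of the list, so the order of the calls mirrors the order of the KDD process.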

Name: - KHUSHI JAIN Class: - B. Tech (IT) Roll No. : - A218

Part B
Observation & Learning: In this experiment, we focused on the architecture of data warehousing and
the techniques used for data mining.
Conclusion: Over the next few years, the growth of data warehousing is going to be enormous, with new
products and technologies coming out. It is going to be important that data warehouse planners and
developers have a clear idea of what they are looking for and then choose strategies and methods that will
provide them with performance

Questions:
1. What are the different tools used in a data warehouse?
➢ Amazon Redshift
➢ Microsoft SQL Server
➢ PostgreSQL
➢ MySQL
➢ Microsoft Azure
➢ Oracle
➢ Skyvia
➢ Xplenty
➢ Alooma
➢ Atom

2. Difference between OLAP and OLTP?


Online Analytical Processing (OLAP) –
Online Analytical Processing is a category of software tools used to analyze data
for business decisions. OLAP provides an environment for gaining insights from
data retrieved from multiple database systems at one time.
Examples – Any type of data warehouse system is an OLAP system. Uses of
OLAP are as follows:
• Spotify analyzes the songs played by its users to build a personalized
homepage of songs and playlists.
• Netflix's movie recommendation system.
Online Transaction Processing (OLTP) –
Online Transaction Processing supports transaction-oriented applications in a 3-tier
architecture. OLTP administers the day-to-day transactions of an organization.
Examples – Uses of OLTP are as follows:
• An ATM center is an OLTP application.
• OLTP maintains the ACID properties during data transactions made via the
application.
• It is also used for online banking, online airline ticket booking, sending a text
message, and adding a book to a shopping cart.
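
The contrast between the two workloads can be sketched with Python's built-in sqlite3 module. The table names, the `transfer` and `totals_by_source` helpers, and the sample amounts are assumptions made for this illustration. Running both workloads on one small in-memory database is also a simplification, since analytical queries would normally run on a separate data warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("CREATE TABLE transfers (src INTEGER, dst INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(src, dst, amount):
    """OLTP-style work: a short transaction on a few rows, applied all-or-nothing."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            # The CHECK constraint rejects an overdraft; the credit above is then undone.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
            conn.execute("INSERT INTO transfers VALUES (?, ?, ?)", (src, dst, amount))
        return True
    except sqlite3.IntegrityError:
        return False

def totals_by_source():
    """OLAP-style work: a read-only query that scans and aggregates many rows."""
    return conn.execute(
        "SELECT src, COUNT(*), SUM(amount) FROM transfers GROUP BY src ORDER BY src"
    ).fetchall()

results = [transfer(1, 2, 30), transfer(2, 1, 10), transfer(1, 2, 500)]
balances = conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()
print(results)
print(balances)
print(totals_by_source())
```

The third transfer fails because it would overdraw account 1, and the rollback leaves both balances unchanged, which is the atomicity part of ACID. The aggregate query changes nothing and only summarizes the recorded transfers.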

3. Difference between OLTP and Data warehouse?

4. What are the different databases used in mining?

Data mining is looking for hidden, valid, and potentially useful patterns in huge data sets.
Data mining is all about discovering unsuspected or previously unknown relationships
amongst the data.

It is a multi-disciplinary skill that uses machine learning, statistics, AI and database
technology.

The insights derived via data mining can be used for marketing, fraud detection,
scientific discovery, etc.

Data mining is also called knowledge discovery, knowledge extraction, data/pattern
analysis, information harvesting, etc. Data mining can be performed on the following types of
data:

• Relational databases
• Data warehouses
• Advanced DB and information repositories
• Object-oriented and object-relational databases
• Transactional and spatial databases
• Heterogeneous and legacy databases
• Multimedia and streaming databases
• Text databases
• Text mining and Web mining

5. What is data staging?


A data staging area (DSA) is a temporary storage area between the data sources and a data
warehouse.

The staging area is mainly used to extract data quickly from its data sources, minimizing the
impact on the sources.

After data has been loaded into the staging area, the staging area is used to combine data
from multiple data sources and to perform transformations, validations and data cleansing. Data is often
transformed into a star schema prior to loading a data warehouse.
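
The role of the staging area can be sketched with Python's built-in sqlite3 module. Everything named here is hypothetical: the two source extracts, the `staging_sales`, `dim_customer` and `fact_sales` tables, and the helper functions. The "star schema" is reduced to one fact table and one dimension table to keep the sketch short.

```python
import sqlite3

# Hypothetical extracts from two source systems: (customer id, name, date, amount).
crm_rows = [("C1", " Alice ", "2024-01-05", "120.50"), ("C2", "Bob", "2024-01-05", "n/a")]
shop_rows = [("C1", "alice", "2024-01-06", "80"), ("C3", "Cara", "2024-01-06", "15.25")]

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE staging_sales (source TEXT, customer_id TEXT, customer_name TEXT,
                            sale_date TEXT, amount TEXT);
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY,
                           customer_id TEXT UNIQUE, name TEXT);
CREATE TABLE fact_sales (customer_key INTEGER REFERENCES dim_customer,
                         sale_date TEXT, amount REAL);
""")

def extract(source, rows):
    """Copy rows unchanged into staging so each source is touched only briefly."""
    db.executemany(
        "INSERT INTO staging_sales VALUES (?, ?, ?, ?, ?)",
        [(source, *row) for row in rows],
    )

def is_number(text):
    """Validation helper: True if the text parses as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False

def load_warehouse():
    """Validate and cleanse staged rows, then load the dimension and fact tables."""
    staged = db.execute(
        "SELECT customer_id, customer_name, sale_date, amount "
        "FROM staging_sales ORDER BY rowid"
    ).fetchall()
    for customer_id, name, sale_date, amount in staged:
        if not is_number(amount):  # validation: skip rows with a bad amount
            continue
        db.execute(  # cleansing: trim and normalise the customer name
            "INSERT OR IGNORE INTO dim_customer (customer_id, name) VALUES (?, ?)",
            (customer_id, name.strip().title()),
        )
        (key,) = db.execute(
            "SELECT customer_key FROM dim_customer WHERE customer_id = ?", (customer_id,)
        ).fetchone()
        db.execute("INSERT INTO fact_sales VALUES (?, ?, ?)", (key, sale_date, float(amount)))
    db.execute("DELETE FROM staging_sales")  # staging is temporary storage
    db.commit()

extract("crm", crm_rows)
extract("shop", shop_rows)
load_warehouse()
totals = db.execute(
    "SELECT name, SUM(amount) FROM fact_sales "
    "JOIN dim_customer USING (customer_key) GROUP BY name ORDER BY name"
).fetchall()
print(totals)
```

The extract step does no processing at all, which is what keeps the load on the source systems small. Combining, validating and cleansing happen afterwards, inside the staging area.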
