Chapter 3 - Data Warehousing


March 2024

Introduction to Data Warehousing


S. Hassan Adelyar, Ph.D


Instructor of Computer Science Faculty
Data Warehousing

Kabul University


Data warehouse

▪ A subject-oriented, integrated, time-variant, & non-volatile collection of data in support of management’s decision-making process.
▪ An enterprise system used for the analysis & reporting of structured & semi-structured data.
▪ Receives data on a regular, periodic basis from multiple sources such as:
  ➢ Point-of-sale transactions
  ➢ Marketing automation
  ➢ Relational databases
  ➢ Customer relationship management
  ➢ Operational sources
  ➢ External data sources
  ➢ Websites
▪ Stores both current & historical data in one place & is designed to give a long-range view of data over time. It supports business intelligence (BI) activities, specifically analysis.
▪ This data is then made available for decision-makers to access & analyze.
▪ A data warehouse is not a single software or hardware product you purchase to provide strategic information.
▪ It is a computing environment where users can find strategic information, & where users are put directly in touch with the data they need to make better decisions.
▪ It is a user-centric environment.
▪ It answers the questions users have about the business, the performance of the various operations, the business trends, & what can be done to improve the business.

A Blend of Many Technologies

▪ The environment for data warehouses & data marts includes the following:
  ➢ Data integration technology & processes that are needed to prepare the data for use;
  ➢ Different tools & applications for a variety of users.
▪ The basic concept of data warehousing is:
  ➢ Take all the data from the operational systems.
  ➢ Integrate all the data from the various sources.
  ➢ Remove inconsistencies & transform the data.
  ➢ Store the data in formats suitable for easy access for decision making.
▪ Figure 1-9 shows how a data warehouse is a blend of the many technologies.

Figure 1-9 The data warehouse: a blend of technologies
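
The four steps of the basic concept (take, integrate, remove inconsistencies & transform, store) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two sources, their field names, & the per-customer summary used as the "stored" form are all hypothetical & do not come from the slides.

```python
# Hypothetical operational sources; names and fields are illustrative only.
pos_sales = [
    {"cust": "C1", "amount": "120.50", "date": "2024-03-01"},
    {"cust": "c2", "amount": "80", "date": "2024-03-02"},
]
web_orders = [
    {"customer_id": "C1", "total": 40.0, "order_date": "2024-03-02"},
    {"customer_id": None, "total": 15.0, "order_date": "2024-03-03"},  # no customer
]


def integrate(pos_rows, web_rows):
    """Steps 1-2: take the data from each source and map it to one layout."""
    unified = []
    for r in pos_rows:
        unified.append({"customer_id": r["cust"], "amount": r["amount"],
                        "sale_date": r["date"], "source": "pos"})
    for r in web_rows:
        unified.append({"customer_id": r["customer_id"], "amount": r["total"],
                        "sale_date": r["order_date"], "source": "web"})
    return unified


def clean_and_transform(rows):
    """Step 3: drop rows without a customer, unify case and numeric types."""
    cleaned = []
    for r in rows:
        if not r["customer_id"]:
            continue
        cleaned.append({**r,
                        "customer_id": r["customer_id"].upper(),
                        "amount": float(r["amount"])})
    return cleaned


def store(rows):
    """Step 4: keep the data in a form that is easy to query (total per customer)."""
    totals = {}
    for r in rows:
        totals[r["customer_id"]] = totals.get(r["customer_id"], 0.0) + r["amount"]
    return totals


warehouse = store(clean_and_transform(integrate(pos_sales, web_orders)))
print(warehouse)
```

A real warehouse would store the cleaned rows in database tables rather than a dictionary; the dictionary only keeps the sketch short.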

Data warehouse architecture

▪ Every data warehouse has three fundamental components:
  ➢ Load Manager
  ➢ Warehouse Manager
  ➢ Data Access Manager
▪ Load manager
  ➢ Responsible for data collection from operational systems.
  ➢ Performs data conversion into some usable form to be further utilized by the user.
  ➢ Includes all the programs & application interfaces which are required for extracting data from the operational systems.
▪ The load manager should perform the following tasks:
  ➢ Data identification
  ➢ Data validation for accuracy
  ➢ Data extraction from the original source
  ➢ Data cleansing
  ➢ Data formatting
  ➢ Consolidation of data from multiple sources into one place
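
Some of the load manager tasks listed above (validation, cleansing, formatting, & consolidation) can be sketched as small Python functions. This is a minimal illustration, not a description of any particular product: the record layout, the source names `hr` & `payroll`, & the validation rules are assumptions made for the example.

```python
from datetime import datetime


def validate(record):
    """Data validation: return a list of problems (an empty list means valid)."""
    problems = []
    if not (record.get("employee_id") or "").strip():
        problems.append("missing employee_id")
    try:
        if float(record.get("salary")) < 0:
            problems.append("negative salary")
    except (TypeError, ValueError):
        problems.append("salary is not a number")
    return problems


def cleanse(record):
    """Data cleansing: trim stray whitespace and normalise letter case."""
    return {
        "employee_id": record["employee_id"].strip().upper(),
        "name": " ".join(record["name"].split()).title(),
        "salary": record["salary"],
        "hired": record["hired"].strip(),
    }


def format_record(record):
    """Data formatting: convert fields to the types the warehouse expects."""
    return {
        **record,
        "salary": float(record["salary"]),
        "hired": datetime.strptime(record["hired"], "%d/%m/%Y").date().isoformat(),
    }


def load(sources):
    """Consolidation: bring valid records from every source into one list."""
    accepted, rejected = [], []
    for source_name, records in sources.items():
        for record in records:
            problems = validate(record)
            if problems:
                rejected.append((source_name, record, problems))
                continue
            accepted.append({**format_record(cleanse(record)), "source": source_name})
    return accepted, rejected


# Hypothetical input from two operational systems.
sources = {
    "hr": [{"employee_id": " e01 ", "name": "  ali   ahmadi",
            "salary": "500", "hired": "01/02/2023"}],
    "payroll": [{"employee_id": "", "name": "x",
                 "salary": "abc", "hired": "05/03/2023"}],
}
accepted, rejected = load(sources)
```

Rejected records are kept together with the reasons for rejection, so that they can be corrected at the source instead of being silently dropped.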
▪ Warehouse manager
  ➢ The main part of the data warehousing system.
  ➢ Holds the massive amount of information from many sources.
  ➢ Organizes data in a way so it becomes easy for anyone to analyze or find the required information.

[Figure: Architecture of a data warehouse]

Database vs. Data Warehouse

▪ Database:
  ➢ The main difference is that in a database, data is collected for multiple transactional purposes.
  ➢ Databases provide real-time data.
▪ Data Warehouse:
  ➢ In a data warehouse, data is collected on an extensive scale to perform analytics.
  ➢ Data warehouses store data to be accessed by large analytical queries.
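
The difference in access pattern can be illustrated with Python's built-in `sqlite3` module. This is only a sketch: in practice the two workloads usually run on separate systems, & the `orders` table & its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, region TEXT, order_year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "North", 2022, 100.0),
        (2, "South", 2022, 250.0),
        (3, "North", 2023, 300.0),
        (4, "South", 2023, 150.0),
    ],
)

# Transactional (database) style: change and read one row by its key.
conn.execute("UPDATE orders SET amount = 120.0 WHERE order_id = 1")
current_order = conn.execute(
    "SELECT amount FROM orders WHERE order_id = 1"
).fetchone()

# Analytical (data warehouse) style: scan many rows and summarise them.
yearly_totals = conn.execute(
    "SELECT order_year, SUM(amount) FROM orders "
    "GROUP BY order_year ORDER BY order_year"
).fetchall()

print(current_order)
print(yearly_totals)
```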



Data warehouse usages

▪ Most common data warehouse usages are:
  ➢ Making real-time decisions: Analyze data in real time to proactively address challenges, identify opportunities, gain efficiency, reduce costs, & respond to business events.
  ➢ Consolidating siloed data: Quickly pull data from multiple structured sources across your organization, such as point-of-sale systems, websites, & email lists, & bring it together into one location so that you can perform analysis & get insights.
  ➢ Enabling business reporting & ad hoc analysis: Keep historical data on a separate server from operational data so that end users can access it & run their own queries & reports without impacting the performance of operational systems or waiting to get help from IT.
  ➢ Implementing machine learning & AI: Collect historical & real-time data to develop algorithms that can provide predictive insights, such as anticipating traffic points or suggesting relevant products to a customer browsing a website.
▪ If your organization has or does any of the following, you’re probably a good candidate for a data warehouse:
  ➢ Multiple sources of disparate data
  ➢ Big-data analysis & visualization
  ➢ Machine learning models & other AI-driven processes
  ➢ Custom report generation & ad hoc analysis

Types of Data Warehouse

▪ Enterprise Data Warehouse (EDW)
  ➢ This type of warehouse serves as a key or central database that facilitates decision-support services throughout the enterprise.
  ➢ The advantage of this type of warehouse is that it provides access to cross-organizational information, offers a unified approach to data representation, & allows running complex queries.
▪ Operational Data Store (ODS)
  ➢ This type of data warehouse refreshes in real time. It is often preferred for routine activities like storing employee records. It is required when data warehouse systems do not support the reporting needs of the business.
▪ Data Mart
  ➢ A data mart is a subset of a data warehouse built for a particular department, region, or business unit.
  ➢ Every department of a business has a central repository or data mart to store data.
  ➢ The data from the data mart is stored in the ODS periodically.
  ➢ The ODS then sends the data to the EDW, where it is stored & used.
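
The idea of a data mart as a subset of a data warehouse for one department can be sketched with a SQL view over a warehouse table. This is one possible way to picture it, assuming a hypothetical `warehouse_sales` table & a Marketing department; real data marts are often separate physical stores.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_sales (department TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [
        ("Marketing", "East", 10.0),
        ("Finance", "East", 20.0),
        ("Marketing", "West", 30.0),
    ],
)

# The data mart exposes only the rows that belong to one department.
conn.execute(
    "CREATE VIEW marketing_mart AS "
    "SELECT region, amount FROM warehouse_sales WHERE department = 'Marketing'"
)

mart_rows = conn.execute(
    "SELECT region, amount FROM marketing_mart ORDER BY region"
).fetchall()
print(mart_rows)
```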

Evolution of Business Intelligence (BI)

▪ Business intelligence for an organization requires two environments:
  ➢ Transformation of data to information;
  ➢ Derivation of knowledge from information.
▪ Business intelligence (BI), therefore, is a broad group of applications & technologies.
▪ First, the term refers to the systems & technologies for gathering, cleaning, consolidating, & storing corporate data.
▪ Next, business intelligence (BI) relates to the tools, techniques, & applications for analyzing the stored data.
▪ BI is an umbrella term that includes concepts & methods to improve business decision making by using fact-based support systems.

BI: Two Environments

▪ When you consider all that BI encompasses, you may view BI for an enterprise as composed of two environments:
  ➢ Data to Information: In this environment data from multiple operational systems are extracted, integrated, cleansed, transformed & stored as information in specially designed repositories.
  ➢ Information to Knowledge: In this environment analytical tools are made available to users to access & analyze the information content in the specially designed repositories & turn information into knowledge.
▪ Figure 1-10 shows the two complementary environments, the data warehousing environment, which transforms data into information, & the analytical environment, which produces knowledge from information.

Figure 1-10 BI: data warehousing & analytical environments

▪ Common functions of business intelligence technologies include:
  ➢ Reporting
  ➢ Online analytical processing
  ➢ Data mining
  ➢ Process mining
  ➢ Complex event processing
  ➢ Business performance management
  ➢ Text mining
  ➢ Predictive analytics
  ➢ Prescriptive analytics

Traditional vs. cloud-based data warehouse

▪ Traditional data warehouses:
  ➢ Hosted on-premises, with data flowing in from relational databases, transactional systems, business applications, & other source systems.
  ➢ Typically designed to capture a subset of data in batches & store it, making them unsuitable for unstructured queries or real-time analysis.
  ➢ Companies also must purchase their own hardware & software with an on-premises data warehouse, making it expensive to scale & maintain.
  ➢ Storage is typically limited compared to compute, so data is transformed quickly & then discarded to keep storage space free.
▪ Cloud-based data warehouse:
  ➢ Today’s data analytics activities have moved to the center of all core business activities, including revenue generation, cost containment, improving operations, & enhancing customer experiences.
  ➢ As data evolves & diversifies, organizations need more robust data warehouse solutions & advanced analytic tools for storing, managing, & analyzing large quantities of data across their organizations.
  ➢ These systems must be scalable, reliable, secure enough for regulated industries, & flexible enough to support a wide variety of data types & big data use cases.

Architecture of a Data Warehouse

▪ The data stored in the warehouse is uploaded from the operational systems.
▪ There are two main approaches used to build a data warehouse system:
  ➢ Extract, transform, load (ETL)
  ➢ Extract, load, transform (ELT)
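
The difference between the two approaches is where the transformation happens. The sketch below contrasts them using Python's built-in `sqlite3` module as a stand-in for the warehouse; the sample rows, table names, & cleaning rules are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical extracted rows: untidy names and amounts stored as text.
raw_rows = [("  alice ", "10.5"), ("BOB", "20"), ("alice", "4.5")]


def run_etl(rows):
    """ETL: transform in the pipeline, then load only the cleaned rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
    transformed = [(name.strip().lower(), float(amount)) for name, amount in rows]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", transformed)
    return conn


def run_elt(rows):
    """ELT: load the raw rows first, then transform inside the warehouse with SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_sales (customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", rows)
    conn.execute(
        "CREATE TABLE sales AS "
        "SELECT LOWER(TRIM(customer)) AS customer, CAST(amount AS REAL) AS amount "
        "FROM raw_sales"
    )
    return conn


def totals(conn):
    """Total amount per customer from the cleaned sales table."""
    return conn.execute(
        "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
    ).fetchall()


etl_totals = totals(run_etl(raw_rows))
elt_totals = totals(run_elt(raw_rows))
print(etl_totals, elt_totals)
```

Both paths end with the same cleaned `sales` table. With ELT the raw rows also stay in the warehouse (`raw_sales`), so they can be transformed again later in a different way.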

Key Characteristics of Data Warehouse

▪ Subject-Oriented
  ➢ A data warehouse is subject-oriented since it provides information organized by topic rather than by the overall processes of a business.
  ➢ Such subjects may be sales, promotion, inventory, etc.
  ➢ For example, if you want to analyze your company’s sales data, you need to build a data warehouse that concentrates on sales.
  ➢ Such a warehouse would provide valuable information like ‘who was your best customer last year?’ or ‘who is likely to be your best customer in the coming year?’
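
The first question above, ‘who was your best customer last year?’, maps directly onto a query against a sales-focused table. The following is a minimal sketch with a hypothetical `sales` table & made-up figures; the second, forward-looking question would need a predictive model & is not covered here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, sale_year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Ahmad", 2023, 500.0),
        ("Sara", 2023, 700.0),
        ("Ahmad", 2023, 300.0),
        ("Sara", 2024, 100.0),
    ],
)


def best_customer(connection, year):
    """Return (customer, total) for the customer with the highest sales in a year."""
    return connection.execute(
        "SELECT customer, SUM(amount) AS total FROM sales "
        "WHERE sale_year = ? GROUP BY customer ORDER BY total DESC LIMIT 1",
        (year,),
    ).fetchone()


print(best_customer(conn, 2023))
```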
▪ Integrated
  ➢ A data warehouse is developed by integrating data from varied sources into a consistent format.
  ➢ The data must be stored in the warehouse in a consistent & universally acceptable manner in terms of naming, format, & coding.
  ➢ This facilitates effective data analysis.
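
Consistency in naming, format, & coding can be illustrated with two source records that describe customers in different ways. This is a sketch under assumptions: the field names, gender codes, & date formats below are hypothetical examples of the kinds of differences that integration has to resolve.

```python
from datetime import datetime

# Naming: map each source's field names to one warehouse name.
FIELD_NAMES = {
    "cust_name": "customer_name", "CustomerName": "customer_name",
    "sex": "gender", "gender": "gender",
    "dob": "birth_date", "birthDate": "birth_date",
}
# Coding: map each source's codes to one agreed code.
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}
# Format: accept the date layouts the sources use, store ISO dates.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def to_iso_date(text):
    """Try each known layout and return the date as YYYY-MM-DD."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {text!r}")


def integrate_record(record):
    """Apply the naming, coding, and format rules to one source record."""
    unified = {FIELD_NAMES[key]: value for key, value in record.items()}
    unified["gender"] = GENDER_CODES[unified["gender"].strip().lower()]
    unified["birth_date"] = to_iso_date(unified["birth_date"])
    return unified


crm_record = {"cust_name": "Sara", "sex": "female", "dob": "05/03/1990"}
web_record = {"CustomerName": "Ahmad", "gender": "M", "birthDate": "1988-11-20"}
integrated = [integrate_record(crm_record), integrate_record(web_record)]
print(integrated)
```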
▪ Non-Volatile
  ➢ Data once entered into a data warehouse must remain unchanged.
  ➢ All data is read-only.
  ➢ Previous data is not erased when current data is entered.
  ➢ This helps you to analyze what has happened & when.
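
One way to picture non-volatility is an append-only table: new rows can be added, but existing rows cannot be updated or deleted. The sketch below enforces this with SQLite triggers; the `balance_history` table is hypothetical, & real warehouses may enforce read-only access by other means, such as permissions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE balance_history (account TEXT, recorded_on TEXT, balance REAL);

-- Reject any attempt to change or remove rows that are already stored.
CREATE TRIGGER block_update BEFORE UPDATE ON balance_history
BEGIN
    SELECT RAISE(ABORT, 'warehouse rows are read-only');
END;
CREATE TRIGGER block_delete BEFORE DELETE ON balance_history
BEGIN
    SELECT RAISE(ABORT, 'warehouse rows are read-only');
END;
""")


def record_balance(account, recorded_on, balance):
    """Append a new row; earlier rows for the same account are kept."""
    conn.execute(
        "INSERT INTO balance_history VALUES (?, ?, ?)",
        (account, recorded_on, balance),
    )


record_balance("A-1", "2024-02-29", 100.0)
record_balance("A-1", "2024-03-31", 140.0)

try:
    conn.execute("UPDATE balance_history SET balance = 0 WHERE account = 'A-1'")
    update_blocked = False
except sqlite3.DatabaseError:
    update_blocked = True

history = conn.execute(
    "SELECT recorded_on, balance FROM balance_history ORDER BY recorded_on"
).fetchall()
print(update_blocked, history)
```

Because the February row is still present after the March row is added, the table shows both what happened & when.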


▪ Time-Variant
  ➢ The data stored in a data warehouse is documented with an element of time, either explicitly or implicitly.
  ➢ An example of time variance in a data warehouse is exhibited in the primary key, which must have an element of time like the day, week, or month.
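
A primary key with an element of time can be sketched as a composite key that combines the business identifier with a period. The example below is an illustration with a hypothetical `product_price` table keyed by product & month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_price ("
    "product_id TEXT, price_month TEXT, price REAL, "
    "PRIMARY KEY (product_id, price_month))"  # the month is part of the key
)
conn.executemany(
    "INSERT INTO product_price VALUES (?, ?, ?)",
    [("P1", "2024-01", 9.5), ("P1", "2024-02", 10.0), ("P1", "2024-03", 10.5)],
)


def price_in(product_id, month):
    """Return the price recorded for a product in a given month, or None."""
    row = conn.execute(
        "SELECT price FROM product_price WHERE product_id = ? AND price_month = ?",
        (product_id, month),
    ).fetchone()
    return row[0] if row else None


print(price_in("P1", "2024-02"))
```

The same product appears once per month, so the price at any past point in time can still be looked up.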



Data Warehousing Tools

▪ Data warehouse tools are software components used to perform several operations on an extensive data set.
▪ These tools help to collect, read, write & transfer data from various sources.
▪ Data warehouses are designed to support operations like data sorting, filtering, merging, etc.
▪ Data warehouse applications can be categorized as:
  ➢ Query & reporting tools
  ➢ Application development tools
  ➢ Data mining tools
  ➢ OLAP tools
▪ Some popular data warehouse tools are Xplenty, Amazon Redshift, Teradata, Oracle 12c, Informatica, IBM Infosphere, Cloudera, & Panoply.


End of Chapter 3

Question / Discussion?
