
ETL: Extract, Transform, Load (By: Manu Bagda, CS-29)

The document discusses Extract, Transform, Load (ETL) which involves three functions: extract, transform, and load. It defines each function, provides an overview of general ETL issues and the ETL process, discusses the work of ETL and the DW phases including design, loading, and refreshment. It also compares features of different ETL tools including Informatica PowerMart, Microsoft Data Transformation Services, and IBM DB2 Warehouse Manager.

Uploaded by

Manu Bagda
Copyright
© Attribution Non-Commercial (BY-NC)

ETL: EXTRACT, TRANSFORM, LOAD
By: Manu Bagda
CS-29

Extract, Transform and Load (ETL): Definition

Three separate functions combined into one development tool:

1. Extract – Reads data from a specified source and extracts a desired subset of data.

2. Transform – Uses rules or lookup tables, or creates combinations with other data, to convert source data to the desired state.

3. Load – Writes the resulting data to a target database.
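
The three functions above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the CSV sample, the `COUNTRY_NAMES` lookup table, the "shipped orders" subset rule, and the `orders` table are all hypothetical, and an in-memory SQLite database stands in for the target database.

```python
import csv
import io
import sqlite3

# Hypothetical source data; in practice this would be a file or a source database.
SOURCE_CSV = """order_id,country_code,amount,status
1,DK,100.50,shipped
2,US,20.00,cancelled
3,DE,75.25,shipped
"""

# Hypothetical lookup table used by the transform step.
COUNTRY_NAMES = {"DK": "Denmark", "US": "United States", "DE": "Germany"}


def extract(text):
    """Read the source and keep only the desired subset (shipped orders)."""
    reader = csv.DictReader(io.StringIO(text))
    return [row for row in reader if row["status"] == "shipped"]


def transform(rows):
    """Apply the lookup table and type conversions to reach the target format."""
    return [
        (
            int(r["order_id"]),
            COUNTRY_NAMES.get(r["country_code"], "Unknown"),
            float(r["amount"]),
        )
        for r in rows
    ]


def load(conn, rows):
    """Write the transformed rows to the target database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, country TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the target database
load(conn, transform(extract(SOURCE_CSV)))
loaded = conn.execute(
    "SELECT order_id, country, amount FROM orders ORDER BY order_id"
).fetchall()
print(loaded)
```

Keeping each function separate mirrors the definition: the cancelled order is dropped during extraction, the country code is resolved during transformation, and only the load step touches the target.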


ETL Overview

• General ETL issues
    The ETL/DW refreshment process
    Building dimensions
    Building fact tables
    Extract
    Transformations/cleansing
    Load
• MS Integration Services
    A concrete ETL tool
    Demo
    Example ETL flow

The ETL Process

• The most underestimated process in DW development
• The most time-consuming process in DW development
    Often, 80% of development time is spent on ETL
• Extract
    Extract relevant data
• Transform
    Transform data to DW format
    Build keys, etc.
    Cleanse data
• Load
    Load data into DW
    Build aggregates, etc.
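
The transform and load sub-steps listed above (cleansing, building keys, building aggregates) can be illustrated with a short sketch. Everything here is an assumption made for the example: the sample rows, the cleansing rules (trim and normalise case, treat a missing quantity as 0), and the `KeyGenerator` helper that hands out integer surrogate keys.

```python
from collections import defaultdict

# Hypothetical extracted rows: (customer name, product, quantity) with messy values.
raw_rows = [
    (" Alice ", "widget", "2"),
    ("BOB", "Widget", "3"),
    ("alice", "gadget", None),  # missing quantity
    ("Bob", "gadget", "1"),
]


def cleanse(row):
    """Normalise text and replace a missing quantity with 0 (an assumed rule)."""
    name, product, qty = row
    return (
        name.strip().title(),
        product.strip().lower(),
        int(qty) if qty is not None else 0,
    )


class KeyGenerator:
    """Hands out integer surrogate keys, one per distinct business value."""

    def __init__(self):
        self.keys = {}

    def key_for(self, value):
        if value not in self.keys:
            self.keys[value] = len(self.keys) + 1
        return self.keys[value]


customer_keys = KeyGenerator()
product_keys = KeyGenerator()

# Transform: cleanse each row, then swap business values for surrogate keys.
fact_rows = []
for row in raw_rows:
    name, product, qty = cleanse(row)
    fact_rows.append(
        (customer_keys.key_for(name), product_keys.key_for(product), qty)
    )

# Aggregate built at load time: total quantity per product key.
qty_by_product = defaultdict(int)
for _, product_key, qty in fact_rows:
    qty_by_product[product_key] += qty

print(fact_rows)
print(dict(qty_by_product))
```

Cleansing runs before key assignment so that " Alice " and "alice" resolve to the same customer key; doing it in the other order would create duplicate dimension entries.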
Work Of ETL
[Diagram slide; figure not included in the text export]
DW Phases

• Design phase
    Modeling, DB design, source selection, …
• Loading phase
    First load/population of the DW
    Based on all data in sources
• Refreshment phase
    Keep the DW up to date with respect to source data changes
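
The difference between the loading phase and the refreshment phase can be sketched as follows. The slides do not say how changes are detected; this example assumes each source row carries a `last_modified` date and uses the newest date already in the DW as a watermark. The `items` table and sample rows are hypothetical, and `INSERT OR REPLACE` is SQLite-specific syntax.

```python
import sqlite3

# Hypothetical source rows: (id, value, last_modified as an ISO date string).
source = [
    (1, "a", "2024-01-01"),
    (2, "b", "2024-01-02"),
]

dw = sqlite3.connect(":memory:")  # stand-in for the data warehouse
dw.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT, last_modified TEXT)"
)


def initial_load(conn, rows):
    """Loading phase: populate the empty DW from all source data."""
    conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)
    conn.commit()


def refresh(conn, rows):
    """Refreshment phase: apply only rows changed since the last load."""
    (watermark,) = conn.execute(
        "SELECT COALESCE(MAX(last_modified), '') FROM items"
    ).fetchone()
    # ISO date strings compare correctly as plain strings.
    changed = [r for r in rows if r[2] > watermark]
    # INSERT OR REPLACE updates existing ids and adds new ones.
    conn.executemany("INSERT OR REPLACE INTO items VALUES (?, ?, ?)", changed)
    conn.commit()
    return len(changed)


initial_load(dw, source)

# Later, the source changes: row 2 is updated and row 3 is new.
source[1] = (2, "b2", "2024-01-05")
source.append((3, "c", "2024-01-06"))

applied = refresh(dw, source)
final_rows = dw.execute("SELECT * FROM items ORDER BY id").fetchall()
print(applied, final_rows)
```

The initial load reads everything once, while each refresh touches only the rows newer than the watermark. This simple scheme does not detect rows deleted in the source; that would need a separate mechanism.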
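ETL/DW Refreshment
[Diagram slide; figure not included in the text export]
Refreshment Workflow
[Diagram slide; figure not included in the text export]
ETL In The Architecture
[Diagram slide; figure not included in the text export]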
Extract, Transform and Load (ETL) Tools considered:

1. IBM DB2 Warehouse Manager

2. Informatica PowerMart

3. Microsoft Data Transformation Services (DTS)


ETL Tools: Features

Feature                      | Informatica PowerMart                 | Microsoft DTS                         | IBM DB2 Warehouse Manager
-----------------------------|---------------------------------------|---------------------------------------|---------------------------------
Heterogeneous Targets        | Multiple, but requires add-on for DB2 | Multiple, DB2 Client Connect required | Only DB2 without Data Joiner
Cost                         | Prohibitive                           | Free with SQL Server                  | Reasonable, but required add-ons
Strength of Client Tool      | Excellent                             | Excellent                             | Weak
Join Dissimilar Objects      | Yes                                   | Yes                                   | No
Vendor & Third Party Support | No Third Party Support                | Numerous Web sites                    | Virtually Non-existent

Thank you
