Data Warehousing and Ab Initio Concepts
Ab Initio
Prepared By : Ashok Chanda
Accenture
Ab Initio Training
Ab Initio Session 1
Introduction to DWH
Explanation of DW Architecture
Operating System / Hardware Support
Introduction to ETL Process
Introduction to Ab Initio
Explanation of Ab Initio Architecture
Data Warehouse-Definitions
Data Warehouse
Simplified Data Warehouse Architecture
Data warehouse
Architecture
Capture
Scrub or Data cleansing
Transform
Load and Index
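The four ETL steps listed above can be sketched as a small pipeline. This is only an illustration, not Ab Initio code: the sample records, the function names, the sales table, and the cleansing rules are all assumptions made for the example, and SQLite stands in for the warehouse database.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from a source system.
RAW_ROWS = [
    {"customer": " alice ", "amount": "10.50", "country": "us"},
    {"customer": "Bob", "amount": "n/a", "country": "US"},  # unusable amount
    {"customer": "carol", "amount": "7.25", "country": "de"},
]


def capture():
    """Capture: read records from the source (here, an in-memory list)."""
    return list(RAW_ROWS)


def scrub(rows):
    """Scrub: trim whitespace and drop records whose amount is not numeric."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # reject the record
        clean.append({
            "customer": row["customer"].strip(),
            "country": row["country"].strip(),
            "amount": amount,
        })
    return clean


def transform(rows):
    """Transform: convert values to the (assumed) warehouse conventions."""
    return [(r["customer"].title(), r["country"].upper(), r["amount"]) for r in rows]


def load_and_index(conn, rows):
    """Load and index: insert the rows, then build an index for queries."""
    conn.execute("CREATE TABLE sales (customer TEXT, country TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.execute("CREATE INDEX idx_sales_country ON sales (country)")
    conn.commit()


conn = sqlite3.connect(":memory:")
load_and_index(conn, transform(scrub(capture())))
loaded = conn.execute(
    "SELECT customer, country, amount FROM sales ORDER BY customer"
).fetchall()
print(loaded)
```

Each step is a separate function so that a step can be replaced (for example, a different cleansing rule) without touching the others.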
ETL Technology
Data warehouse
Data Warehouse
Data Warehouse
Environment
In addition to a
relational/multidimensional database,
a data warehouse environment often
consists of an ETL solution, an OLAP
engine, client analysis tools, and
other applications that manage the
process of gathering data and
delivering it to business users.
Data Mart
Data Mart
Star Schema
Advantages of Star
Schemas
Star schema
Snowflake Schema
Diagrammatic
representation for
Snowflake Schema
Fact Table
The central table in a star schema
is called the fact table. A fact table
typically has two types of columns:
those that contain facts (measures) and
those that are foreign keys to dimension
tables. The primary key of a fact table
is usually a composite key made up of
all of its foreign keys.
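The fact table description above can be illustrated with a tiny star schema. This is a minimal sketch using SQLite; the table names (dim_product, dim_store, fact_sales), the columns, and the sample rows are hypothetical and chosen only for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);

-- Fact table: two foreign keys to the dimensions plus two fact columns.
-- The primary key is the composite of all foreign keys.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    units_sold INTEGER,
    revenue    REAL,
    PRIMARY KEY (product_id, store_id)
);
""")

conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                 [(1, "Widget"), (2, "Gadget")])
conn.executemany("INSERT INTO dim_store VALUES (?, ?)",
                 [(10, "Pune"), (20, "Mumbai")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                 [(1, 10, 5, 50.0), (1, 20, 3, 30.0), (2, 10, 2, 40.0)])

# A typical star-schema query: join the fact table to a dimension
# and aggregate a fact column.
revenue_by_product = conn.execute("""
    SELECT p.name, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()
print(revenue_by_product)
```

In a real warehouse the composite key would normally also include a date dimension; it is left out here to keep the sketch short.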
Operating System /
Hardware Support
Parallel Functionality
Parallel Features
An overview of typical parallel functionality is given below:
Queries: Parallel queries can enhance scalability for many
query operations.
Data load: Performance is always a serious issue when
loading large databases. Meeting response time
requirements is the overriding factor in determining the
best load method and should be a key part of a
performance benchmark.
Create table as select: This feature makes it possible to
create aggregated tables in parallel.
Index creation: Parallel index creation exploits the
benefits of parallel hardware by distributing the workload
generated by a large index across a large number of
processors.
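The idea behind building an aggregate in parallel can be sketched as: split the rows into partitions, aggregate each partition independently, then merge the partial results. The example below is an assumption-laden illustration, not a database feature: the sample rows and function names are invented, and a thread pool is used only to keep the sketch self-contained, whereas a parallel database would spread the work over separate processors or nodes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical (region, amount) rows to be aggregated.
ROWS = [("east", 10), ("west", 5), ("east", 7),
        ("north", 3), ("west", 1), ("north", 4)]


def partition(rows, n):
    """Split rows round-robin into n partitions."""
    return [rows[i::n] for i in range(n)]


def aggregate(part):
    """Sum the amounts per region within one partition."""
    totals = Counter()
    for region, amount in part:
        totals[region] += amount
    return totals


def parallel_totals(rows, workers=3):
    """Aggregate each partition in its own worker, then merge the partials."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(aggregate, partition(rows, workers)))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts together
    return dict(merged)


totals = parallel_totals(ROWS)
print(totals)
```

The merge step works because a sum can be computed per partition and then combined; aggregates without that property need a different merge strategy.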
A Multi-CPU Computer
(SMP)
A Network of Multi-CPU
Nodes
A Network of Networks
Introduction to Ab
Initio
History of Ab Initio
History of Ab Initio
Ab Initio's Focus
Moving Data
move small and large volumes of data in an
efficient manner
deal with the complexity associated with
business data
High Performance
scalable solutions
Better productivity
Ab Initio's Software
Applications of Ab Initio
Software
Data transformation.
Applications of Ab Initio
Software in terms of Data
Warehouse
8. Ab Initio does not need a dedicated administrator; a UNIX or NT administrator will suffice, whereas other ETL tools require dedicated administrative work.
Ab Initio Product Architecture
(layers, top to bottom)
User Applications
Development Environments: GDE, Shell
Component Library, User-defined Components, 3rd Party Components
Ab Initio EME
The Ab Initio Co>Operating System
Native Operating System (Unix, Windows, OS/390)
Ab Initio Architecture Explanation
Co>Operating System
Services
Control
Data Transport
Ab Initio: What We Do
The Ab Initio
Co>Operating System
The Ab Initio
Co>Operating System (Continued)
The Ab Initio Co>Operating System
depends on parallelism to connect
(i.e., cooperate with) diverse databases.
It extracts, transforms, and loads data
to and from Teradata and other data sources.
[Figure: GDE (top layer); Co-Op System; platforms: Solaris, AIX, NT, Linux, NCR]
Sun Solaris
IBM AIX
Hewlett-Packard HP-UX
Siemens Pyramid Reliant UNIX
IBM DYNIX/ptx
Silicon Graphics IRIX
Ab Initio Co>Operating System
Ab Initio Software Corporation, headquartered in Lexington, MA,
develops software solutions that process vast amounts of data (well
into the terabyte range) in a timely fashion by employing many
(often hundreds of) server processors in parallel. Major corporations
worldwide use Ab Initio software in mission-critical, enterprise-wide
data processing systems. Together, Teradata and Ab Initio
deliver:
End-to-end solutions for integrating and processing data
throughout the enterprise
Software that is flexible, efficient, and robust, with unlimited
scalability
Professional and highly responsive support
The Co>Operating System executes your application by creating and
managing the processes and data flows that the components and
arrows represent.
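The relationship between components, arrows, and data flows described above can be mimicked in a few lines of Python. This is not Ab Initio code and does not use any Ab Initio API: the component names, the record layout, and the run_graph function are hypothetical. Each generator plays the role of a component, and passing one generator into the next plays the role of an arrow.

```python
def read_input(records):
    """Input component: emits records one at a time."""
    for record in records:
        yield record


def filter_by(predicate, flow):
    """Filter component: passes on only records that satisfy the predicate."""
    for record in flow:
        if predicate(record):
            yield record


def reformat(fn, flow):
    """Reformat component: applies a transform function to each record."""
    for record in flow:
        yield fn(record)


def run_graph(records):
    """Wire the components together; each assignment is one 'arrow'."""
    flow = read_input(records)
    flow = filter_by(lambda r: r["amount"] > 0, flow)
    flow = reformat(lambda r: {**r, "amount_cents": round(r["amount"] * 100)}, flow)
    return list(flow)  # pulling from the last component drives the whole graph


result = run_graph([
    {"id": 1, "amount": 2.5},
    {"id": 2, "amount": 0},
    {"id": 3, "amount": 1.0},
])
print(result)
```

Records stream through the chain one at a time, so no component needs to hold the full data set. The sketch runs in a single process; according to the text above, the Co>Operating System creates and manages separate processes for the components.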
Graphical Development
Environment GDE
The GDE
The Graphical Development Environment (GDE)
provides a graphical user interface to the
services of the Co>Operating System. The GDE
enables you to create applications by dragging and
dropping components, and allows point-and-click
operations on executable flowcharts. The
Co>Operating System can execute these
flowcharts directly. Graphical monitoring of
running applications allows you to quantify data
volumes and execution times, helping you spot
opportunities for improving performance.
Components
EME
Benefits of EME
The Enterprise Meta>Environment (EME) provides a rich
store for applications and all of their associated
information, including:
Technical metadata: application-related business
rules, record formats, and execution statistics
Business metadata: user-defined documentation of
job functions, roles, and responsibilities
Metadata is data about data and is critical to
understanding and driving your business processes
and computational resources. Storing and using
metadata is as important to your business as
storing and using data.
Stepwise explanation of Ab
Initio Architecture
Stepwise explanation of Ab
Initio Architecture (continued)
EME Interfaces
Thank You
End of Session 1