Unit 1 Front Room Architecture
The front room is the public face of the warehouse. It's what the business users see and work with day-to-
day.
BI Application Types:
The role of the data warehouse is to be the platform for business intelligence. The most important BI
application types include the following:
Direct access queries: The classic ad hoc requests initiated by business users from desktop query tool
applications.
Standard reports: Regularly scheduled reports typically delivered via the BI portal or as spreadsheets or
PDFs to an online library.
Analytic applications: Applications containing powerful analysis algorithms in addition to normal
database queries.
Dashboards and scorecards: Multi-subject user interfaces showing key performance indicators (KPIs) textually and graphically.
These BI applications are delivered to the business users through a variety of application interface
options, including:
BI portals and custom front ends: Special-purpose user interfaces that provide easy access to web-based BI applications or support specific complex queries and result displays.
Handheld device interfaces: Special versions of BI applications engineered for handheld screens and input devices.
Instantaneous BI (EII): An extreme form of real-time data warehouse architecture with a direct connection from the source transaction system to the user's screen.
BI Management Services:
BI management services range from shared services, which typically reside between the presentation server and the user, to desktop services, which typically run at the user level and mostly pertain to report definition and results display.
Shared Services
Shared services include metadata services, security services, usage monitoring, query management,
enterprise reporting, and web and portal services.
Metadata Services
Metadata services provide a metadata model that describes the structure of the data warehouse for the tool's benefit and that simplifies and enhances it for the user's understanding.
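As an illustration only (a minimal sketch with hypothetical names; no particular tool's metadata format is implied here), a tiny semantic-layer model might map business-friendly labels to physical presentation server objects like this:

```python
# Minimal sketch of a BI semantic-layer metadata model (hypothetical names).
# Business-friendly labels are mapped to physical presentation server objects.
semantic_layer = {
    "Sales": {                                # business subject area
        "physical_table": "fact_sales",       # presentation server object
        "attributes": {
            "Order Date": "date_key",
            "Customer Name": "customer_name",
        },
        "facts": {
            "Revenue": "SUM(sales_amount)",   # computed/aggregated column
            "Units Sold": "SUM(quantity)",
        },
    }
}

def describe(subject):
    """Return the user-facing view of one subject area."""
    meta = semantic_layer[subject]
    return sorted(list(meta["attributes"]) + list(meta["facts"]))

print(describe("Sales"))  # ['Customer Name', 'Order Date', 'Revenue', 'Units Sold']
```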
Security Services
Security services facilitate a user's connection to the database. Security services include authorization and
authentication services through which the user is identified and access rights are determined or access is
refused.
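The text does not prescribe a mechanism, but a minimal sketch of the authorization half might look like the following, with hypothetical users, groups, and subject-area grants:

```python
# Hypothetical security-service sketch: users belong to groups,
# and groups are granted access to subject areas.
user_groups = {"jsmith": {"finance_analysts"}, "apatel": {"sales_analysts"}}
group_grants = {"finance_analysts": {"GL", "Budget"}, "sales_analysts": {"Sales"}}

def is_authorized(user, subject_area):
    """Return True if any of the user's groups is granted the subject area."""
    groups = user_groups.get(user, set())
    return any(subject_area in group_grants.get(g, set()) for g in groups)

print(is_authorized("jsmith", "Budget"))  # True: access granted
print(is_authorized("apatel", "Budget"))  # False: access refused
```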
Usage Monitoring
Usage monitoring involves capturing information about the use of the data warehouse.
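For example (a minimal sketch, assuming a simple relational log table rather than any particular monitoring product), usage monitoring can be as basic as recording who ran what, when, and how much came back:

```python
import sqlite3, time

# Minimal usage-monitoring sketch: record each query event in a log table.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE query_log (
    user_name TEXT, query_text TEXT, rows_returned INTEGER, run_at REAL)""")

def log_query(user, sql, rows):
    conn.execute("INSERT INTO query_log VALUES (?, ?, ?, ?)",
                 (user, sql, rows, time.time()))

log_query("jsmith", "SELECT * FROM fact_sales WHERE ...", 842)
print(conn.execute("SELECT user_name, rows_returned FROM query_log").fetchall())
```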
Query Management
Query management services manage the translation of the user's on-screen query specification into the query syntax submitted to the server, the execution of the query against the database, and the return of the result set to the desktop.
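As a rough illustration (a sketch only, with hypothetical table and column names; real BI tools generate far more sophisticated SQL), the translation step might take an on-screen specification of columns, filters, and groupings and turn it into query syntax:

```python
import sqlite3

# Hypothetical on-screen query specification captured by the BI tool.
spec = {"table": "fact_sales", "columns": ["region", "SUM(amount) AS revenue"],
        "filter": "year = 2024", "group_by": ["region"]}

def build_sql(spec):
    """Translate the user's specification into SQL submitted to the server."""
    sql = f"SELECT {', '.join(spec['columns'])} FROM {spec['table']}"
    if spec.get("filter"):
        sql += f" WHERE {spec['filter']}"
    if spec.get("group_by"):
        sql += f" GROUP BY {', '.join(spec['group_by'])}"
    return sql

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_sales (region TEXT, amount REAL, year INTEGER)")
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                 [("East", 100.0, 2024), ("West", 250.0, 2024), ("East", 75.0, 2023)])

print(build_sql(spec))                            # the generated query syntax
print(conn.execute(build_sql(spec)).fetchall())   # result set returned to the desktop
```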
Web Access
Your front room architecture needs to provide users with web browser-based data access services.
Portal Services
Portal tools usually leverage the web server to provide a more general user interface for accessing
organizational, communications, and presentation services.
BI Data Stores
Stored Reports
As data moves into the front room and closer to the user, it becomes more diffuse. Users can generate
hundreds of ad hoc queries and reports in a day. These are typically centered on specific questions,
investigations of anomalies, or tracking the impact of a program or event. Most individual queries yield
result sets with fewer than 10,000 rows — a large percentage have fewer than 1,000 rows. These result
sets are stored in the BI tool, at least temporarily. Much of the time, the results are actually transferred
into a spreadsheet and analyzed further.
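For instance (a minimal sketch; in practice this transfer is usually a built-in export feature of the BI tool), moving a small result set into a spreadsheet-friendly file can be as simple as:

```python
import csv

# Hypothetical result set returned by an ad hoc query (well under 10,000 rows).
result_set = [("East", 2024, 1175.0), ("West", 2024, 2250.0)]

# Write it to a CSV file that opens directly in a spreadsheet for further analysis.
with open("region_revenue.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["region", "year", "revenue"])   # column headers
    writer.writerows(result_set)
```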
BI Metadata
Front room BI metadata includes the elements detailed in the following sections.
PROCESS METADATA
TECHNICAL METADATA
- Standard query and report definitions
- BI semantic layer definition, including business names for all tables and columns mapped to appropriate presentation server objects, join paths, computed columns, and business groupings. May also include aggregate navigation and drill-across functionality.
- Application logic
- Security groups and user assignments
BUSINESS METADATA
- Conformed attribute and fact definitions and business rules
- User documentation and training materials
Source Systems
It is a rare data warehouse, especially at the enterprise level, that does not pull data from multiple sources.
Extract
Most often, the challenge in the extract process is determining what data to extract and what kinds of
filters to apply. We all have stories about fields with multiple uses, values that can't possibly exist, payments made from accounts that were never created, and other data horrors. From an
architecture point of view, you need to understand the requirements of the extract process so you can
determine what kinds of services will be needed. The extract-related ETL functions include:
- Data profiling
- Change data capture
- Extract system
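To make change data capture concrete, here is a minimal sketch. It assumes the source table carries a reliable last_modified timestamp, which is itself an assumption many real sources violate; the table and column names are hypothetical.

```python
import sqlite3

# Hypothetical source table with a last_modified column used for change data capture.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE orders (order_id INTEGER, amount REAL, last_modified TEXT)")
src.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(1, 10.0, "2024-06-01"), (2, 20.0, "2024-06-03"), (3, 30.0, "2024-06-04")])

def extract_changes(conn, since):
    """Pull only the rows changed since the previous extract run."""
    return conn.execute(
        "SELECT order_id, amount, last_modified FROM orders WHERE last_modified > ?",
        (since,)).fetchall()

last_extract_time = "2024-06-02"   # stored by the ETL system after each run
print(extract_changes(src, last_extract_time))  # rows 2 and 3 only
```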
Deliver
Once the data is properly cleaned and aligned, the next step in the ETL process involves preparing the data for user consumption and delivering it to the presentation servers. The ETL functions that support delivery and ongoing management include:
- Job scheduler
- Backup system
- Recovery and restart
- Version control
- Version migration
- Workflow monitor
- Sorting
- Lineage and dependency
- Problem escalation
- Paralleling and pipelining
- Compliance manager
- Security
- Metadata repository
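As a small illustration of the job scheduler's core problem (a sketch only, with hypothetical job names; production schedulers add retries, alerts, and restart logic), ETL jobs must run in dependency order:

```python
# Requires Python 3.9+ for graphlib.
from graphlib import TopologicalSorter

# Hypothetical ETL jobs and their prerequisites (job: set of jobs it depends on).
dependencies = {
    "load_dim_customer": {"extract_crm"},
    "load_fact_sales": {"extract_orders", "load_dim_customer"},
    "build_aggregates": {"load_fact_sales"},
}

# A job scheduler must run each job only after everything it depends on has finished.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)
# e.g. ['extract_crm', 'extract_orders', 'load_dim_customer', 'load_fact_sales', 'build_aggregates']
```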
ETL Metadata
PROCESS METADATA
TECHNICAL METADATA
- System inventory
- Source descriptions of all data sources, including record layouts, column definitions, and business rules.
- Source access methods
- ETL data store specifications and DDL scripts
- ETL data store policies and procedures
- ETL job logic, extracts, and transforms
BUSINESS METADATA
Aggregates
Unfortunately, most organizations have fairly large datasets, at least large enough that users would have to wait a relatively long time for any summary-level query to return. In order to improve performance at
summary levels, we add the second element of the presentation server layer: aggregates. Pre-aggregating
data during the load process is one of the primary tools available to improve performance for analytic
queries. These aggregates occupy a separate logical layer, but they could be implemented in the
relational database, in an OLAP server, or on a separate application server.
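For example (a minimal sketch with hypothetical table names; in practice this is handled by the ETL system, materialized views, or an OLAP engine), pre-aggregating a daily fact table to a monthly summary during the load might look like:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical atomic fact table at daily grain.
conn.execute("CREATE TABLE fact_sales_daily (sale_date TEXT, product TEXT, amount REAL)")
conn.executemany("INSERT INTO fact_sales_daily VALUES (?, ?, ?)",
                 [("2024-06-01", "A", 10.0), ("2024-06-15", "A", 20.0), ("2024-06-02", "B", 5.0)])

# Pre-aggregate to monthly grain as part of the load process.
conn.execute("""
    CREATE TABLE agg_sales_monthly AS
    SELECT substr(sale_date, 1, 7) AS sale_month, product, SUM(amount) AS amount
    FROM fact_sales_daily
    GROUP BY sale_month, product
""")
print(conn.execute("SELECT * FROM agg_sales_monthly ORDER BY product").fetchall())
# [('2024-06', 'A', 30.0), ('2024-06', 'B', 5.0)]
```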
Aggregates are like indexes. They will be built and rebuilt on a daily basis; the choice of aggregates will
change over time based on analysis of actual query usage. Your architecture will need to include
functionality to track aggregate usage to support this. Ideally, the aggregate navigation system will do this
for you and automatically adjust the aggregates it creates. We call this usage-based optimization. Because the aggregate portfolio keeps changing, it's also a good idea to have your atomic data stored in a solid, reliable, flexible relational database from which the aggregates can be rebuilt.
Although we refer to this layer as aggregates, the actual data structures may also include detail level data
for performance purposes. Some OLAP engines, for example, perform much faster when retrieving data
from the OLAP database rather than drilling through to the relational engine. In this case, if the OLAP
engine can hold the detail, it makes sense to put it in the OLAP database along with the aggregates. We
encourage you to think of the aggregate layer as essentially a fat index.
Aggregate Navigation
Having aggregates and atomic data increases the complexity of the data environment. Therefore, you
must provide a way to insulate the users from this complexity. As we said earlier, aggregates are like
indexes; they are a tool to improve performance, and they should be transparent to user queries and BI
application developers. This leads us to the third essential component of the presentation server: the
aggregate navigator. Presuming you create aggregates for performance, your architecture must include
aggregate navigation functionality. The aggregate navigator receives a user query based on the atomic
level dimensional model. It examines the query to see if it can be answered using a smaller, aggregate
table. If so, the query is rewritten to work against the aggregate table and submitted to the database
engine. The results are returned to the user, who is happy with such fast performance and unaware of the
magic it took to deliver it. At the implementation level, there are a range of technologies to provide
aggregate navigation functionality, including:
- OLAP engines
- Materialized views in the relational database with optimizer-based navigation
- Relational OLAP (ROLAP) services
- BI application servers or query tools
Many of these technologies include functionality to build and host the aggregates. In the case of an OLAP
engine, these aggregates are typically kept in a separate server, often running on a separate machine.
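To make the rewrite step concrete, here is a deliberately simplified sketch (hypothetical table names; real aggregate navigators work from metadata about grain and dimensionality, not a hard-coded lookup): if the incoming query only needs columns the aggregate can supply, it is redirected to the smaller table.

```python
# Simplified aggregate-navigator sketch: route a query to an aggregate table
# when the aggregate contains every column the query needs.
aggregates = {
    # aggregate table -> (base table it summarizes, columns it can answer)
    "agg_sales_monthly": ("fact_sales_daily", {"sale_month", "product", "amount"}),
}

def navigate(base_table, needed_columns):
    """Return the smallest table that can answer the query."""
    for agg_table, (base, available) in aggregates.items():
        if base == base_table and needed_columns <= available:
            return agg_table          # rewrite the query against the aggregate
    return base_table                 # fall back to the atomic fact table

print(navigate("fact_sales_daily", {"sale_month", "amount"}))   # agg_sales_monthly
print(navigate("fact_sales_daily", {"sale_date", "amount"}))    # fact_sales_daily
```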
Impact Analysis
First, an integrated repository could help you identify the impact of making a change to the DW/BI system. A change to the source system data model would impact the ETL process and may cause a change
to the target data model, which would then impact any database definitions based on that element, like
indexes, partitions, and aggregates. It would also impact any reports that include that element. If all the
metadata is in one place, or at least connected by common keys and IDs, then it would be fairly easy to
understand the impact of this change.
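A minimal sketch of that idea (hypothetical object names; a real metadata repository would hold these links in tables connected by common keys and IDs): record which objects are built from which, then walk the graph to find everything downstream of a changed element.

```python
# Hypothetical dependency links from an integrated metadata repository:
# object -> objects that are built from it (downstream dependents).
depends_on_me = {
    "src.orders.amount": ["etl.load_fact_sales"],
    "etl.load_fact_sales": ["dw.fact_sales.amount"],
    "dw.fact_sales.amount": ["agg_sales_monthly", "report.monthly_revenue"],
}

def impact_of(changed_object):
    """Walk the dependency graph to find everything affected by a change."""
    affected, to_visit = set(), [changed_object]
    while to_visit:
        current = to_visit.pop()
        for dependent in depends_on_me.get(current, []):
            if dependent not in affected:
                affected.add(dependent)
                to_visit.append(dependent)
    return sorted(affected)

print(impact_of("src.orders.amount"))
# ['agg_sales_monthly', 'dw.fact_sales.amount', 'etl.load_fact_sales', 'report.monthly_revenue']
```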