
Data Lineage

Related terms:

Data Governance, Data Model, Master Data Management, Metadata, Big Data
Project, Metadata Management


Data Quality Management


Mark Allen, Dalton Cervo, in Multi-Domain Master Data Management, 2015

Data Lineage and Traceability


Data lineage states where data comes from, where it goes, and what transformations are applied to it as it flows through multiple processes. It helps in understanding the data life cycle, and it is one of the most critical pieces of information from a metadata management point of view, as will be described in Chapter 10.

From data-quality and data-governance perspectives, it is important to understand data lineage to ensure that business rules exist where expected, that calculation rules and other transformations are correct, and that system inputs and outputs are compatible. Data traceability is the exercise of tracking access, values, and changes to the data as they flow through their lineage. Data traceability can be used for data validation and verification as well as data auditing. In summary, data lineage is the documentation of the data life cycle, while data traceability is the process of verifying that the data follows its life cycle as expected.

Many data-quality projects will require data traceability to track information and ensure that its usage is proper. Newly deployed or replaced interfaces might benefit from a data traceability effort to verify that their role within the life cycle is seamless or to evaluate whether they affect other intermediate components. Data traceability might also be required in an auditing project to demonstrate transparency, compliance, and adherence to regulations.
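To make the distinction concrete, here is a minimal sketch in Python (not from the book; the system names, fields, and rule descriptions are illustrative assumptions) showing lineage documented as a sequence of steps and traceability implemented as a check that observed data movement matched that documentation.

from dataclasses import dataclass

@dataclass
class LineageStep:
    """One documented hop in the data life cycle."""
    source: str          # where the data comes from
    target: str          # where it goes next
    transformation: str  # rule applied along the way (e.g., a calculation)

# Lineage: documentation of the expected life cycle (hypothetical systems and rules).
documented_lineage = [
    LineageStep("crm.orders", "staging.orders", "trim and standardize codes"),
    LineageStep("staging.orders", "warehouse.orders", "apply currency conversion"),
]

def trace(observed_steps, documented=documented_lineage):
    """Traceability: verify the data actually followed its documented lineage."""
    issues = []
    for expected, observed in zip(documented, observed_steps):
        if (observed.source, observed.target) != (expected.source, expected.target):
            issues.append(f"unexpected flow {observed.source} -> {observed.target}")
        elif observed.transformation != expected.transformation:
            issues.append(f"unexpected rule at {observed.target}: {observed.transformation}")
    return issues

# Example: an audit run compares what actually happened against the documentation.
print(trace(documented_lineage))  # an empty list means the data followed its life cycle as expected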



Data Management, Models, and Metadata
Laura Sebastian-Coleman, in Measuring Data Quality for Ongoing Improvement,
2013

Data Lineage and Data Provenance


Data lineage is related to both the data chain and the information life cycle. The word
lineage refers to a pedigree or line of descent from an ancestor. In biology, a lineage
is a sequence of species that is considered to have evolved from a common ancestor.
But we also think of lineage in terms of direct inheritance from an immediate
predecessor. Most people concerned with the lineage of data want to understand two
aspects of it. First, they want to know the data’s origin or provenance—the earliest
instance of the data. (The word provenance in art has implications similar to lineage;
it refers to a record of ownership that can be used as a guide for a work’s authenticity
or quality.) Second, people want to know how (and sometimes why) the data has
changed since that earliest instance. Change can take place within one system or
between systems.

Understanding changes in data requires understanding the data chain, the rules
that have been applied to data as it moves along the data chain, and what effects the
rules have had on the data. Data lineage includes the concept of an origin for the
data—its original source or provenance—and the movement and change of the data
as it passes through systems and is adopted for different uses (the sequence of steps
within the data chain through which data has passed). Pushing the metaphor, we
can imagine that any data that changes as it moves through the data chain includes
some but not all characteristics of its previous states and that it will pick up other
characteristics through its evolution.

Data lineage is important to data quality measurement because lineage influences expectations. A health care example can illustrate this concept. Medical claims submitted to insurance companies contain procedure codes that represent the actions taken as part of a patient’s health care. These codes are highly standardized in hierarchies that reference bodily systems. Medical providers (doctors, nurses, physical
therapists, and the like) choose which procedure codes accurately reflect the services
provided. In order to pay claims (a process called adjudication), sometimes codes
are bundled into sets. When this happens, different codes (representing sets) are
associated with the claims. This process means that specific values in these data fields
are changed as the claims are processed. Some changes are executed through rules
embedded in system programming. Others may be the result of manual intervention
from a claim processor. A person using such data with the expectation that the codes
on the adjudicated claims are the very same codes submitted by a doctor may be
surprised by discrepancies. The data would not meet a basic expectation. Without
an understanding of the data’s lineage, a person might reasonably conclude that
something is wrong with the data. If analysis requires codes as submitted and only
as submitted, then adjudicated claim data would not be the appropriate source for
that purpose.
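As a hedged illustration of how an undocumented transformation breaks expectations, the Python sketch below applies an invented bundling rule (the codes and bundle name are made up, not real procedure codes) and records the change as lineage notes.

# Hypothetical bundling rule applied during claim adjudication; codes are invented.
BUNDLES = {frozenset({"A100", "A200"}): "BUNDLE-A"}

def adjudicate(submitted_codes):
    """Replace known code combinations with a bundled set code and record the change."""
    codes = set(submitted_codes)
    lineage_notes = []
    for members, bundle_code in BUNDLES.items():
        if members <= codes:
            codes = (codes - members) | {bundle_code}
            lineage_notes.append(f"bundled {sorted(members)} into {bundle_code}")
    return sorted(codes), lineage_notes

adjudicated, notes = adjudicate(["A100", "A200", "Z999"])
# Without the lineage notes, the difference between submitted and adjudicated codes
# looks like a data error; with them, it is an expected, documented transformation.
print(adjudicated, notes)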


Data Warehouses II
Charles D. Tupper, in Data Architecture, 2011

Standard or Corporate Business Language


On an integration project, such as master data management, data lineage definition, or application and data consolidation, it is necessary to know what data you have, where it is located, and how it is related across different application systems. Software products exist today to move, profile, and cleanse the data. There are also products that address the discovery and debugging of the business rules and transformation logic that differ from one system to another.

If this is done manually, the data discovery process will require months of human involvement to discover cross-system data relationships, derive transformation logic, assess data consistency, and identify exceptions.

Data discovery products like Exeros and Sypherlink Harvester can mine both databases and applications to capture the data and metadata that define the core of a common business language and store it for actionable use. It would take very little effort to turn the result into a corporate dictionary.

It is critical after the compilation that the accumulated result be opened up to all enterprise businesses to resolve and define data conflicts and definitional issues. Even this can be done expeditiously with the use of a Wikipedia-type tool that allows clarifications to be done in an open forum. This both accomplishes the standardization of the language and resolves issues, while educating the corporation as a whole.
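A minimal sketch in Python (the business term, systems, table and column names are hypothetical) of how harvested discovery output might be stored as a corporate-dictionary entry and opened up for wiki-style clarification:

# Hypothetical corporate-dictionary entry assembled from data discovery output.
corporate_dictionary = {
    "Customer": {
        "definition": "A party that has purchased or contracted for a product.",
        "locations": [  # cross-system data relationships found by discovery tooling
            {"system": "CRM", "table": "cust_master", "column": "cust_id"},
            {"system": "Billing", "table": "account", "column": "customer_no"},
        ],
        "open_issues": ["Billing includes prospects; CRM does not."],
    }
}

# Business users append clarifications, wiki-style, until definitional conflicts are resolved.
corporate_dictionary["Customer"]["open_issues"].append(
    "Proposed resolution: exclude prospects from the standard definition."
)
print(corporate_dictionary["Customer"]["open_issues"])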



Engagement
John Ladley, in Data Governance (Second Edition), 2020

Ramifications and benefits


Understanding the data environment is critical for understanding how to get DG
operational. Most DG programs will eventually require some sort of data lineage
or provenance, that is, tracking where data starts, goes, is used, and who used it.
Very often a DG program will show immediate value when an understanding of the
data environment is presented to company risk management. You are assembling
the type of material that regulators will continue to request and insist on in greater
detail as time moves on.

If you do look into the cost of ownership of data, I can guarantee that the total
amount spent will be surprising, and there is a good chance that management will
tell you they do not believe the number. However, it is quite common for the total
cost of data to be four to five times higher than thought.


Data Integration Processes


Rick Sherman, in Business Intelligence Guidebook, 2015

Table and Row Updates


Data integration job audit data tracks the flow of data through the BI data architecture at the grain of the table. It is a best practice to track row-level audit data to better manage it, enable data lineage analysis, and assist in improving system performance. The template schema depicted in Figure 12.16 enables this type of audit data. The schema includes:

FIGURE 12.16. Data integration table—job audit columns.

• DI_Job_ID—The data integration job identifier is the job ID that the data integration tool generated. This identifier is a foreign key to the data integration tool’s processing metadata. If that metadata is available, this link provides a powerful mechanism to analyze data integration processing and performance down to the level of a table’s row.
• SOR_ID—This is the SOR identifier that ties this row to a particular system of record (SOR). Use it when the table’s rows are sourced from multiple systems of record; it enables each row to be tied to its specific SOR.
• DI_Create_Date—This is the date and time that this row was originally created in this table. Often a database trigger is used to insert the current time, but the data integration job could also insert the current time directly.
• DI_Modified_Date—This is the most recent date and time that this row has been modified in this table. Often a database trigger is used to insert the current time, but the data integration job could also insert the current time directly. It is a standard practice to populate this column with an initial value far in the future, such as “9999-12-31”, rather than leaving it NULL, to avoid dealing with NULLs when analyzing this column.
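A small sketch in Python (the job identifier, SOR code, and row contents are invented) of how a data integration job might stamp these row-level audit columns when a database trigger is not used:

from datetime import datetime

FAR_FUTURE = "9999-12-31 00:00:00"  # initial DI_Modified_Date, set far ahead to avoid NULL handling

def stamp_audit_columns(row, di_job_id, sor_id):
    """Add row-level job audit columns during the load (instead of relying on a trigger)."""
    row["DI_Job_ID"] = di_job_id          # foreign key to the DI tool's processing metadata
    row["SOR_ID"] = sor_id                # ties the row to its system of record
    row["DI_Create_Date"] = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    row["DI_Modified_Date"] = FAR_FUTURE  # replaced with the real timestamp on first update
    return row

# Example: stamping one incoming row from a hypothetical source system.
print(stamp_audit_columns({"order_id": 1001, "amount": 25.00}, di_job_id=42, sor_id="ERP_EU"))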


Data Governance as an Operations Process
Lowell Fryman, ... Dan Meers, in The Data and Analytics Playbook, 2017

Enterprise Architecture
Enterprise architecture is a broad topic, and we think most companies now realize the importance of architecture as a foundation for success. A good architecture is rarely noticed, but a bad architecture can restrict flexibility and numb the business.

It may seem obvious that enterprise architecture is important to data governance. However, data governance is rarely represented in enterprise architecture discussions or approvals. The technical architecture choice and the set of applications you select to interact with business users greatly affect the availability and quality of data. Using a data governance lens, selecting an architecture is actually choosing the best applications to manage your organization’s data and ensure it is available to others—important variations on the core data governance jobs. If the data governance operating model is newly implemented, it is probably not yet integrated into enterprise architecture.

There are a few key areas where data governance and enterprise architecture need
to collaborate:
• Data sourcing

• Tracking data quality explicitly

• Creating control points to support data monitoring

• Data modeling/data architecture that supports data lineage processing

• Master and reference data management

• Business rule application for business rules that address data-quality issues

• Third-party contracting and data transparency

Integrating data governance processes into architecture will require changing established architecture processes. If enterprise architecture processes are not mature at your organization, you may find that data governance needs to fit into ad hoc and potentially poorly understood and executed architecture processes. The specific way you integrate data governance with architecture processes will vary:

• Ensure that data governance checklists are provided to the architecture group.

• Change the architecture approval process to ensure that sign-off by a data governance leader is required.

• Ensure that there is a designated liaison to the architecture group.

• Provide architect training on data governance objectives and approaches.

• Change the organizational structure for the data governance group and colocate the data governance and architecture resources into one overall group. This approach has other ramifications and issues but it is an often-encountered model in many companies. We generally recommend against this approach because it can greatly impact the effectiveness of the core data governance jobs.

You may need to implement all of these approaches over time. Many companies start with the easier process of assigning a data governance liaison to the architecture group, then seek to influence the approval and architecture development process through checklists and approval changes. The Playbook does not dictate an answer to data-sourcing questions, but it does provide insight into the context a selected architecture must operate in to be consistent with data governance and the operations process.



Architecting to Deliver Value From a Big Data and Hybrid Cloud Architecture
Mandy Chessell, ... Tim Vincent, in Software Architecture for Big Data and the
Cloud, 2017

3.12 Metadata and Governance


Metadata is descriptive data about data. In a data warehouse environment, the
metadata is typically limited to the structural schemas used to organize the data in
different zones in the warehouse. For the more advanced environments, metadata
may also include data lineage and measured quality information of the systems
supplying data to the warehouse.

A big data environment is more dynamic than a data warehouse environment, and it continuously pulls in data from a much greater pool of sources. It quickly becomes impossible for the individuals running the big data environment to remember the origin and content of all the data sets it contains. As a result, metadata capture and management become a key part of the big data environment. Given the volume, variety, and velocity of the data, metadata management must be automated. Similarly, fulfilling governance requirements for the data must also be automated as much as possible.

Enabling this automation adds to the types of metadata that must be maintained, since governance is driven from the business context, not from the technical implementation around the data. For example, the secrecy required for a company's
financial reports is very high just before the results are reported. However, once
they have been released, they are public information. The technology used to store
the data has not changed. However, time has changed the business impact of
an unauthorized disclosure of the information, and thus the governance program
providing the data protection has to be aware of that context.

Similar examples from data quality management, lifecycle management and data
protection illustrate that the requirements that drive information governance come
from the business significance of the data and how it is to be used. This means
the metadata must capture both the technical implementation of the data and the
business context of its creation and use so that governance requirements and actions
can be assigned appropriately.
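A hedged sketch in Python (the dates and classification labels are purely illustrative) of the point that the governance action depends on business context such as time, not on the storage technology:

from datetime import date

def report_classification(today, release_date):
    """Same data, different protection requirement depending on business context."""
    return "PUBLIC" if today >= release_date else "RESTRICTED"

release = date(2024, 2, 1)  # hypothetical results-release date held as business metadata
print(report_classification(date(2024, 1, 15), release))  # RESTRICTED before release
print(report_classification(date(2024, 2, 2), release))   # PUBLIC once results are out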

Earlier in this chapter, we introduced the concept of the managed data lake, where metadata and governance are a key part of ensuring a data lake remains a useful resource rather than becoming a data swamp. This is a necessary first step in getting the most value out of big data. However, as the different big data solutions reviewed in this chapter show, big data is not born in the data lake. It comes from other systems and contexts. Metadata and governance need to extend to these systems and be incorporated into the data flows and processing throughout the solution.


Data Warehousing and Online Analytical Processing
Jiawei Han, ... Jian Pei, in Data Mining (Third Edition), 2012

4.1.7 Metadata Repository


Metadata are data about data. When used in a data warehouse, metadata are the data
that define warehouse objects. Figure 4.1 showed a metadata repository within the
bottom tier of the data warehousing architecture. Metadata are created for the data
names and definitions of the given warehouse. Additional metadata are created
and captured for timestamping any extracted data, the source of the extracted data,
and missing fields that have been added by data cleaning or integration processes.

A metadata repository should contain the following:

• A description of the data warehouse structure, which includes the warehouse schema, views, dimensions, hierarchies, and derived data definitions, as well as data mart locations and contents.

• Operational metadata, which include data lineage (history of migrated data and the sequence of transformations applied to it), currency of data (active, archived, or purged), and monitoring information (warehouse usage statistics, error reports, and audit trails).

• The algorithms used for summarization, which include measure and dimension definition algorithms, data on granularity, partitions, subject areas, aggregation, summarization, and predefined queries and reports.

• Mapping from the operational environment to the data warehouse, which includes source databases and their contents, gateway descriptions, data partitions, data extraction, cleaning, transformation rules and defaults, data refresh and purging rules, and security (user authorization and access control).

• Data related to system performance, which include indices and profiles that improve data access and retrieval performance, in addition to rules for the timing and scheduling of refresh, update, and replication cycles.

• Business metadata, which include business terms and definitions, data ownership information, and charging policies.
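As a rough, hypothetical illustration (in Python; the object and field names are not from the text), a single repository entry might combine several of these categories like this:

# Hypothetical metadata-repository entry for one warehouse table.
metadata_entry = {
    "object": "warehouse.sales_fact",
    "structure": {"schema": "star", "dimensions": ["date", "product", "store"]},
    "operational": {
        "lineage": ["oltp.orders -> staging.orders -> warehouse.sales_fact"],
        "currency": "active",
        "monitoring": {"error_reports": 0, "audit_trail": "enabled"},
    },
    "summarization": {"granularity": "daily", "aggregation": "sum(amount)"},
    "mapping": {"source": "oltp.orders", "cleaning_rules": ["default missing store to 'UNKNOWN'"]},
    "business": {"owner": "Sales Operations", "definition": "One row per order line per day."},
}
print(metadata_entry["operational"]["lineage"])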

A data warehouse contains different levels of summarization, of which metadata is one. Other types include current detailed data (which are almost always on disk), older detailed data (which are usually on tertiary storage), lightly summarized data, and highly summarized data (which may or may not be physically housed).

Metadata play a very different role than other data warehouse data and are important
for many reasons. For example, metadata are used as a directory to help the decision
support system analyst locate the contents of the data warehouse, and as a guide to
the data mapping when data are transformed from the operational environment to
the data warehouse environment. Metadata also serve as a guide to the algorithms
used for summarization between the current detailed data and the lightly summarized data, and between the lightly summarized data and the highly summarized
data. Metadata should be stored and managed persistently (i.e., on disk).


Metadata Management
Mark Allen, Dalton Cervo, in Multi-Domain Master Data Management, 2015

Connecting the Business and Technical Tracks


The management of business and technical/operational metadata is quite different, but obviously there is a connection. Behind business rules and definitions lie data elements, which exist in multiple systems throughout the enterprise. A mapping can be created between business terms and their equivalent technical counterparts, and through data lineage it is possible to establish this relationship. The result is astounding: one can locate a given business term and trace it to multiple applications, data sources, interfaces, models, analytics, reports, and other elements. This is the ultimate goal of metadata management: search for a business term, learn and understand its definition, and track it throughout the entire enterprise. Imagine how powerful this information is to data governance, data quality, data stewards, and business and technical teams.

Figure 10.7 depicts a simplified data lineage to convey this idea. Notice that the application UI is being used as a connecting point. Business terms are mapped to labels on the screens of multiple applications, which are mapped to databases, which in turn can potentially be mapped to many other elements. This daisy-chain effect allows any metadata object to serve as a starting point from which to navigate wherever data are flowing.

Figure 10.7. Simplified data lineage
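To illustrate the daisy-chain navigation (the mappings below are hypothetical and are not Figure 10.7 itself), here is a short Python sketch that traces a business term through its mapped elements with a simple graph traversal:

from collections import deque

# Hypothetical mappings: business term -> UI label -> database column -> report.
mappings = {
    "Customer Lifetime Value": ["CRM UI: 'CLV' field"],
    "CRM UI: 'CLV' field": ["crm_db.customer.clv"],
    "crm_db.customer.clv": ["Quarterly Revenue Report"],
    "Quarterly Revenue Report": [],
}

def trace_term(start, graph=mappings):
    """Walk the daisy chain from any metadata object to everything it connects to."""
    seen, queue = [], deque([start])
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.append(node)
            queue.extend(graph.get(node, []))
    return seen

print(trace_term("Customer Lifetime Value"))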


Architecture and design


John Ladley, in Data Governance (Second Edition), 2020

Readiness for tools


Your strategy work may have indicated some sort of role for tools. Before you proceed, you need to confirm that your program will benefit from a tool and that you can effectively operate it. Here are a few scenarios to guide your thinking:

1. Highly regulated industry—Data lineage and discovery will support compliance. Obviously, metadata tools will document meaning. You still do not need to go buy tools until you know what your operating model looks like, but it will not be long before a tool will be most helpful.

2. Master data initiatives—A common, major data initiative is MDM. MDM flat out will not be sustainable, and therefore wastes a LOT of money, without DG. But supporting tools are not necessarily mandated until the DG activities are underway. Usually the MDM vendor supplies some sort of metadata. The useful metadata around MDM is often mapping old things to new. The master data should clear up semantic differences across business functions, so the need to manage common data definitions, standards, lineage, and reference data makes mapping and glossary-type products handy.

3. Advanced analytics/Big Data activity—This is an interesting area, as a lot of benefit can come out of a data science area without any DG oversight at all, but only to a point. At some point a data scientist will say “we are getting slowed down by data quality,” or inconsistent definitions, etc. Quite often, the data scientists, while quite expert in statistical methods, have no clue about data management. I have had data scientists tell me that “there may be an issue with data quality here. Have you heard of this?” At this point they want to write their own tool, but data discovery and data quality tools and statistical model management enter the discussion instead (hopefully).

4. Artificial Intelligence/Machine Learning—Probably the only area where I will get keenly interested in tools well before the other scenarios is AI. That is because AI, depending on the application of course, can go very well or horribly wrong, and sometimes it is hard to tell the difference. Given distortions in AI based on model bias, data quality, and the operationalizing of erroneous models, AI often requires proactive data profiling, discovery, and significant understanding of data lineage.

If you think you need DG technology, make sure you can actually implement and
support the tool. Even if you can identify with the above use cases, you also must
ensure that your organization is ready to use a DG tool, as readiness is a huge factor
in the decision-making process and the success of a DG program.

