MIT-Data & Information
Transaction Data
Transactional data describe an internal or external event that takes place as an organization conducts its business.
Master Data
Master data describes the people, products, locations, and accounts involved in an organization's business. These are slow-changing assets.
Reference Data
ODS
An operational data store (ODS) is designed to integrate disparate data from multiple sources so that business operations, analysis and reporting can be carried out.
Metadata
Metadata literally means data about data. Metadata describe or characterize other data and
make it easier to retrieve, interpret, or use information.
Unstructured Data
Examples of transaction data include sales orders, invoices, purchase orders, shipping documents, passport applications, credit card payments, and insurance claims.
Examples of master data include people (e.g., customers, employees, vendors, suppliers), places (e.g., locations, sales territories, offices), and things (e.g., accounts, products, assets, document sets).
Metadata
Technical metadata are metadata used to describe technology and data structures. Examples of technical metadata are field
names, length, type, lineage, and database table layouts.
Business metadata describe the nontechnical aspects of data and their usage. Examples are field definitions, report names,
headings in reports and on Web pages, and application screen names.
Audit trail metadata are a specific type of metadata, typically stored in a record and protected from alteration, that capture
how, when, and by whom the data were created, accessed, updated, or deleted. Audit trail metadata are used for security,
compliance, or forensic purposes. Examples include timestamp, creator, create date, and update date.
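As an illustration only, the three kinds of metadata described above can be kept side by side for a single field. The Python sketch below is not part of the course material: the class names, the sample field `cust_id`, and the user names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class TechnicalMetadata:
    """Describes technology and data structures."""
    field_name: str
    data_type: str
    length: int


@dataclass(frozen=True)
class BusinessMetadata:
    """Describes the nontechnical meaning and usage of the data."""
    definition: str
    report_heading: str


@dataclass(frozen=True)
class AuditEntry:
    """One audit trail record; frozen so it cannot be altered once created."""
    action: str  # "created", "accessed", "updated" or "deleted"
    user: str
    timestamp: datetime


@dataclass
class FieldMetadata:
    technical: TechnicalMetadata
    business: BusinessMetadata
    audit_trail: list = field(default_factory=list)

    def record(self, action: str, user: str) -> None:
        # Entries are only ever appended, never edited in place.
        self.audit_trail.append(AuditEntry(action, user, datetime.now(timezone.utc)))


# Hypothetical example field
customer_id = FieldMetadata(
    technical=TechnicalMetadata("cust_id", "CHAR", 10),
    business=BusinessMetadata("Unique identifier of a customer", "Customer No."),
)
customer_id.record("created", "jdoe")
customer_id.record("updated", "asmith")
```

Freezing `AuditEntry` is one simple way to mirror the "protected from alteration" property; a real system would enforce this in the storage layer as well.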
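[Column heading missing in source]
Data Class: Master
History: Yes
Integration: Yes
Data Currency: Real Time
Data Scope: Fully Integrated, Application Neutral
Data Creation: Yes

Transactional Systems
Data Class: Master, Transaction
History: No
Integration: No
Data Currency: Real Time
Data Scope: Local, Application Specific
Data Creation: Yes

Operational Data Stores (ODS)
Data Class: Master, Transaction
History: Limited
Integration: Limited
Data Currency: Close to Real Time
Data Scope: Integrated (Limited to a few applications)
Data Creation: No

Data Warehouse (DWH)
Data Class: Master, Transaction, Analytic
History: Yes
Integration: Yes
Data Currency: Sometimes Daily, Weekly, Monthly
Data Scope: Fully Integrated, Application Neutral
Data Creation: Limited to Derived Data

Data Marts (DM)
Data Class: Master, Transaction, Analytic
History: Yes
Integration: Yes
Data Currency: Sometimes Daily, Weekly, Monthly
Data Scope: Analytical, Derived and Summarized, Application Specific
Data Creation: Limited to Derived Data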
Value of Information
[Figure: Information Value plotted against Information Sophistication, with stages Operational Applications, OLAP, Descriptive Modeling, Predictive Modeling, and Optimization]
OLAP: Online analytical processing
Knowledge is understanding, awareness and recognition of a situation and familiarity with its complexity. It may also include assumptions and theories about causes.
Data: Important Events
Information: Data + Definition + Format + Timeframe + Relevance
Knowledge: Information + Patterns & Trends + Relationships + Assumptions
General ledger
Budgeting
Human Resource Management and Payroll
Customer Relationship Management
Forecasting, Materials Management, Production Planning
Supply Chain Management
Order Processing, Purchase and Inventory Management
Logistics, Distribution, Fulfillment
Training
Dimension
Description
Accuracy
Integrity
Consistency
Completeness
Uniqueness
Accessibility
Precision
Timeliness
Data Quality Management involves more than just addressing historical data quality issues through
data profiling and re-engineering. It involves preventing these issues from occurring in the first place.
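Data profiling, mentioned above, usually starts with simple measurements of individual dimensions. The Python sketch below measures two of them, completeness and uniqueness. The formulas are common working definitions rather than ones given on the slide, and the sample records are invented.

```python
def completeness(records, column):
    """Share of records whose value for `column` is present (not None or empty)."""
    if not records:
        return 1.0
    filled = sum(1 for r in records if r.get(column) not in (None, ""))
    return filled / len(records)


def uniqueness(records, key):
    """Number of distinct key values divided by the number of records."""
    if not records:
        return 1.0
    values = [r.get(key) for r in records]
    return len(set(values)) / len(values)


# Invented sample data: one duplicated id and two missing e-mail addresses
customers = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 2, "email": "c@example.com"},
    {"id": 4, "email": None},
]

print(completeness(customers, "email"))  # 0.5
print(uniqueness(customers, "id"))       # 0.75
```

Scores like these only detect problems after the fact; prevention depends on controls applied when the data are entered.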
Relevance
Usability
Usefulness
Believability
Unambiguous: Each piece of data has a unique meaning and can be easily comprehended.
Objectivity: Data is objective, unbiased and impartial, i.e., it does not depend on the judgment, interpretation, or evaluation of people.
Data Movement
Data Exchange
Data Distribution
Data Governance
Reports
Enterprise Data Warehouse /
Data Marts
Transaction Data Repository
A hierarchical structure
Major department heads report to a president or top-level manager
Managerial pyramid shows the hierarchy of decision making and authority
Strategic Management
The People
Board of Directors
Chief Executive Officer
President
Decisions
Develop Overall Goals
Long-term Planning
Determine Direction
Political
Economic
Competitive
Tactical Management
People
o Business Unit Managers
o Vice-President to Middle Manager
Decisions
Operational Management
People
o Middle managers to supervisors
o Self-directed teams
Decisions
o short-range planning
o production schedules
o day-to-day decisions
o use of resources
o enforce policies
o follow procedures
Analytical models
Specialized databases
A decision maker's own insights and judgments
An interactive, computer-based modeling process
o Decision Quality
o Improved Communication
o Cost Reduction
o Increased Productivity
o Time Savings
o Improved Customer And Employee Satisfaction
[Figure: information systems (TPS, MIS, DSS, EIS) positioned against Operational, Tactical, and Strategic Management Decisions]
Information Characteristics and Decision Structure
Strategic Management (unstructured decisions): ad hoc, unscheduled, summarized, infrequent, forward looking, external, wide scope
Tactical Management (semi-structured decisions)
Operational Management (structured decisions): pre-specified, scheduled, detailed, frequent, historical, internal, narrow focus
Data needs to be managed as an Enterprise Asset, rather than as an asset of any one group or
department
Rationale
Data is a valuable asset that provides reference and operational information to support
business decisions and transactions between departments and divisions, as well as with trading
partners and customers.
Effective management and operation of the organization and successful provision of goods
and services depend on accurate, timely and secure data.
Data usually originates in one business process and/or department, but it may be used by
other business processes and departments, as well as beyond the organizational boundary
into the extended organization, including trading partners.
Lack of a common, enterprise-wide approach to data results in separate islands of data,
created in different departments and managed according to varying and disparate policies and
principles.
Data collected and stored must be readily available to all (i.e. shared with) authorized users,
whenever and wherever needed
Rationale
Historically, data has been seen to belong within specific applications and within organizational
boundaries. This created an artificial notion of our data. Data in fact transcends
applications and operational boundaries, and must be treated as an Enterprise Asset rather than
as belonging to a specific application or user group.
Creating mechanisms which enable easy and consistent sharing of data between applications
and user groups will reduce application integration complexity.
There will be a single, clearly identified, data storage location for each managed Enterprise
Data element.
Rationale
A single data source is critical to achieving comprehensive data integrity.
A single source reduces confusion, complexity, and therefore data management costs.
A single data source enables a more effective data ownership practice.
4 - Replicate data
Definition
The organization will provide replicated data to enhance access performance and/or
specialized analysis needs. However every data item must have one and only one source of
reference, which is singularly used as the source for all replications. The replicated data should
not be persisted beyond the original usage time-window.
Rationale
Business value, access performance demands, or specialized analysis needs may require data
replications.
System response performance impacts system usability and therefore complex reporting and
data analyses should be performed on replicated data and not on the source data.
As the organization opens up its data and information repositories to a broader audience via
new channels (e.g., the Internet) the organization must ensure its original source data is
protected from accidental corruption.
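The replication rule above (one source of reference, read-only copies that expire after their usage window) can be sketched in Python as follows. The `Replica` class, its parameters, and the sample stock figure are hypothetical. Time is passed in explicitly so the behaviour is easy to follow.

```python
from types import MappingProxyType


class Replica:
    """Read-only snapshot of source data, valid only within a usage time window."""

    def __init__(self, source: dict, created_at: float, window_seconds: float):
        # Copy the source so that analysis never touches the original data,
        # and wrap the copy in a read-only view to prevent accidental changes.
        self._data = MappingProxyType(dict(source))
        self._expires_at = created_at + window_seconds

    def read(self, key, now: float):
        if now >= self._expires_at:
            raise RuntimeError("replica expired; replicate again from the source of reference")
        return self._data[key]


# The single source of reference (hypothetical stock levels)
source = {"sku-1": 10}
replica = Replica(source, created_at=0.0, window_seconds=60.0)

source["sku-1"] = 12                      # the source keeps changing
print(replica.read("sku-1", now=30.0))    # 10, the snapshot is unaffected
```

Reading the replica after the 60-second window raises an error, which forces consumers back to the source of reference rather than letting stale copies persist.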
Data must be validated at the source (i.e., at data entry or collection), prior to being recorded
in the reference source repository
Rationale
Validation Rules prevent entry and propagation of invalid data, which in turn can lead to
adverse business decisions, or erroneous business transactions.
Validating data at the source is critical to improve and maintain data integrity.
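A minimal Python sketch of validation at the source is shown below: a record is checked against validation rules before it is written to the repository. The rules, the field names, and the in-memory `repository` list are assumptions made for illustration.

```python
def validate_order(order: dict) -> list:
    """Return a list of rule violations; an empty list means the record is valid."""
    errors = []
    if not order.get("customer_id"):
        errors.append("customer_id is required")
    quantity = order.get("quantity")
    if not isinstance(quantity, int) or quantity <= 0:
        errors.append("quantity must be a positive integer")
    return errors


repository = []  # stands in for the reference source repository


def record_order(order: dict) -> bool:
    """Store the order only if it passes every validation rule."""
    if validate_order(order):
        return False  # invalid data never reaches the repository
    repository.append(order)
    return True


accepted = record_order({"customer_id": "C1", "quantity": 2})
rejected = record_order({"customer_id": "", "quantity": 0})
```

After these two calls only the first order is stored; the second is rejected at entry and cannot propagate to downstream systems.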
Each data item must have a clearly defined Owner who is accountable and responsible for the
quality of the data, including its definition.
Rationale
Clearly defined Ownership provides clarity in data definition, reducing unnecessary
complexity.
Data Ownership ensures responsibility and accountability for the maintenance of data integrity
and protection of strategic Enterprise Assets.
Clear data ownership ensures accountability for the quality, accuracy and timeliness of data at
the source.
Data ownership identifies the single point of authority for the specification of the quality
requirements for each data element.
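One way to picture the "single point of authority" is a registry that maps each data element to exactly one owner. The Python sketch below is illustrative only: the `OwnershipRegistry` class, the element name, and the owner names are hypothetical.

```python
class OwnershipRegistry:
    """Maps each data element to exactly one accountable owner."""

    def __init__(self):
        self._owners = {}

    def assign(self, element: str, owner: str) -> None:
        current = self._owners.get(element)
        if current is not None and current != owner:
            # A second, different owner would blur accountability.
            raise ValueError(f"{element} is already owned by {current}")
        self._owners[element] = owner

    def owner_of(self, element: str) -> str:
        # A KeyError here signals a data element that has no owner yet.
        return self._owners[element]


registry = OwnershipRegistry()
registry.assign("customer.address", "Head of Customer Service")
```

Attempting to assign `customer.address` to a different owner raises an error, so ownership has to be transferred deliberately rather than duplicated by accident.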
Automated data entry techniques and processes will be used to minimize data entry errors
where economically and technically viable.
Rationale
Manual data entry is inherently prone to error.
Retail industry innovations such as bar coding, Radio Frequency Identification (RFID), and Point Of
Sale (POS) scanning facilitate automated data entry.
Data accuracy is critical to support decision making in the retail industry.
Costs of manual data entry increase as the costs of human resources increase.
Rationale
Changing business strategies, business processes, service delivery and data access channels
are predicted to lead to significant increases in data storage capacity requirements.
Early identification of changing data storage requirements allows capacity to meet demand
in a planned and cost-effective manner.
Definition
Rationale
Breaches of data security increase exposure to potential financial loss.
Rationale
Data must be managed as a critical strategic asset and access to data must be managed according to
the Corporate Data Management Policies, as defined by the Information Architecture.
Regular breaches of the Data Management Policies regarding data access result in either unsecured
sharing of data or excessive control of access to data. Both of these extremes are undesirable.
The organization serves a variety of user groups with differing requirements. Accordingly, the
Corporate Data Management Policies must be flexible enough to provide access to corporate
information whilst also securing it.
Sensitive and/or classified data often needs to be protected and access to it restricted, according to
the sensitivity.
Different data elements have different levels of sensitivity and therefore access is allowed at
different levels.
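Access by sensitivity level can be sketched in Python as follows. The four level names, the classification table, and the deny-by-default rule for unclassified elements are assumptions made for illustration, not part of any stated Corporate Data Management Policy.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


# Hypothetical classification of data elements
ELEMENT_SENSITIVITY = {
    "product_name": Sensitivity.PUBLIC,
    "sales_forecast": Sensitivity.INTERNAL,
    "customer_card_number": Sensitivity.RESTRICTED,
}


def can_access(clearance: Sensitivity, element: str) -> bool:
    """Allow access when the user's clearance meets the element's sensitivity.

    Elements that have not been classified are treated as RESTRICTED.
    """
    required = ELEMENT_SENSITIVITY.get(element, Sensitivity.RESTRICTED)
    return clearance >= required


print(can_access(Sensitivity.INTERNAL, "sales_forecast"))        # True
print(can_access(Sensitivity.INTERNAL, "customer_card_number"))  # False
```

A graded scale like this avoids both extremes named above: data are neither shared without control nor locked away from users who are entitled to them.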
Rationale
This is necessary to minimize sources of data error.
Tracking errors back to their source is critical for a successful data integrity program.
The approach will help minimize the cost of data maintenance over time.
Rationale
Integration must be accomplished across the complete data value chain which includes all trading
partners.
Externally sourced, competitive information has value to the organization and therefore should also
be managed.
Rationale
Repeated data entry is costly.
Event driven, single data creation enables optimal reuse of data and information.
Repeated data entry causes data integrity problems.
Rationale
The life cycle of a data item or subject area varies over time. The organization must therefore be
cognizant of these varying life cycles and of the current life cycle stage of each data item or
subject area.