
Ascertain Accuracy of Data (1)

This document outlines the importance of data quality dimensions in assessing and improving the quality of data within organizations. It defines six core data quality dimensions—completeness, uniqueness, timeliness, validity, accuracy, and consistency—and provides best practices for their application. The goal is to establish a common understanding among data quality practitioners and business stakeholders to enhance decision-making and reduce confusion in data management.

Uploaded by

Genet Assefa

Ascertain Accuracy of Data

Data is a set of values of qualitative or quantitative variables that do not carry complete
meaning on their own and need further analysis.

Data and information are often used interchangeably; however, the extent to which a set of data
is informative to someone depends on the extent to which it is unexpected by that person. The
amount of information content in a data stream may be characterized by its usage and
application.

Data quality dimension


The term data quality dimension has been widely used for a number of years to describe the
measure of the quality of data. However, even amongst data quality professionals the key data
quality dimensions are not universally agreed. This state of affairs has led to much confusion
within the data quality community and is even more bewildering for those who are new to the
discipline and more importantly to business stakeholders. Socrates said, “The beginning of
wisdom is the definition of terms”. Hence, the goal of this whitepaper is to define the key data
quality dimensions and provide context so there can be a common understanding for industry
professionals and business stakeholders alike.

WHAT IS A DATA QUALITY DIMENSION?

A Data Quality (DQ) Dimension is a recognised term used by data management professionals to
describe a feature* of data that can be measured or assessed against defined standards in order
to determine the quality of data. For example:

• A test data set is measured as 93% complete

• The result of an accuracy assessment for a data item in a test data set was 84%

A DQ Dimension is different to, and should not be confused with, other dimension terminologies
such as those used in:

• other aspects of data management, e.g. a data warehouse dimension or a data cube dimension

• physics, where a dimension refers to the structure of space or how material objects are located
in time

* Characteristic, attribute or facet
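
As an illustration of how a figure such as "93% complete" might be produced, the following is a minimal sketch, not a prescribed method. The function name `completeness` and the rule that `None` and blank strings count as missing are assumptions made for this example; an organisation would agree its own definition of a missing value.

```python
from typing import Any, Iterable, Optional


def completeness(values: Iterable[Optional[Any]]) -> float:
    """Return the percentage of values that are populated.

    Assumption for this sketch: a value is "missing" if it is None
    or a string containing only whitespace.
    """
    items = list(values)
    if not items:
        raise ValueError("cannot measure completeness of an empty data set")
    populated = sum(
        1
        for v in items
        if v is not None and not (isinstance(v, str) and v.strip() == "")
    )
    return 100.0 * populated / len(items)


# Hypothetical test data: 3 of the 5 email fields are populated.
emails = ["a@example.com", None, "b@example.com", "", "c@example.com"]
print(f"{completeness(emails):.0f}% complete")  # prints "60% complete"
```

The same pattern (count the items that pass, divide by the items assessed) applies to any dimension that can be measured from the data itself.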

CONTEXT

The best practice laid out in this document is designed to assist data quality practitioners when
assessing and describing the quality of the data in their organisations. This document
defines six generic data quality dimensions as best practice. This will help to
reduce the uncertainty and confusion that may arise when considering data quality. It is suggested
that these dimensions and definitions be adopted by data quality practitioners as the
standard method for assessing and describing the quality of data. However, in some situations
one or more dimensions may not be relevant. The intention is for organisations to use these
dimensions to measure the impact of poor data quality in terms of cost, reputation,
regulatory compliance, etc.

Before attempting to use data quality dimensions, an organisation needs to agree the quality rules
against which the data will be assessed. These rules should be developed based upon
the six data quality dimensions, organisational requirements for data and the impact on an
organisation of data not complying with these rules.

Examples of organisational impacts could include:

• incorrect or missing email addresses would have a significant impact on any marketing
campaigns

• inaccurate personal details may lead to missed sales opportunities or a rise in customer
complaints

• goods can get shipped to the wrong locations

• incorrect product measurements can lead to significant transportation issues, e.g. the product
will not fit into a lorry or, alternatively, too many lorries may have been ordered for the size of
the actual load

Data generally only has value when it supports a business process or organisational decision
making. The agreed data quality rules should take account of the value that data can provide to
an organisation. If data is identified as having a very high value in a certain context, then this
may indicate that more rigorous data quality rules are required in that context.

HOW TO USE DATA QUALITY DIMENSIONS

Organisations select the data quality dimensions and associated dimension thresholds based on
their business context, requirements, levels of risk etc. Note that each dimension is likely to have
a different weighting and in order to obtain an accurate measure of the quality of data, the
organisation will need to determine how much each dimension contributes to the data quality as
a whole.
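
One possible way to combine weighted dimensions into a single figure is a weighted average, sketched below. This is an assumption about how the weighting could be applied, not a method stated in this document. The 93% and 84% scores echo the earlier examples; the uniqueness score, the weights and the function name `overall_quality` are hypothetical.

```python
from typing import Mapping


def overall_quality(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted average of per-dimension scores, each on a 0-100 scale."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[d] * weights[d] for d in scores) / total_weight


# Hypothetical scores and weights; here accuracy matters most to the business.
scores = {"completeness": 93.0, "accuracy": 84.0, "uniqueness": 100.0}
weights = {"completeness": 2.0, "accuracy": 3.0, "uniqueness": 1.0}

print(f"overall quality: {overall_quality(scores, weights):.1f}%")
```

A weighted average lets a strong dimension mask a weak one, so an organisation may prefer to report each dimension against its own threshold as well as the combined figure.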

A typical Data Quality Assessment approach might be:

1. Identify which data items need to be assessed for data quality. Typically these will be data
items deemed critical to business operations and associated management reporting

2. Assess which data quality dimensions to use and their associated weighting

3. For each data quality dimension, define values or ranges representing good and bad quality
data. Note that, as a data set may support multiple requirements, a number of different data
quality assessments may need to be performed

4. Apply the assessment criteria to the data items


5. Review the results and determine if data quality is acceptable or not

6. Where appropriate, take corrective actions, e.g. clean the data and improve data handling
processes to prevent future recurrences

7. Repeat the above on a periodic basis to monitor trends in Data Quality
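
Steps 3 to 5 of the approach above can be sketched as follows. This is a minimal illustration under stated assumptions: the `Rule` class, the `assess` function, the sample records and the thresholds are all hypothetical, and real rules would come from the organisation's agreed data quality rules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    dimension: str                  # which DQ dimension the rule measures
    check: Callable[[dict], bool]   # returns True when a record passes
    threshold: float                # minimum acceptable pass rate, in percent


def assess(records: List[dict], rules: List[Rule]) -> Dict[str, dict]:
    """Apply each rule to every record and compare the pass rate to its threshold."""
    if not records:
        raise ValueError("no records to assess")
    results = {}
    for rule in rules:
        passed = sum(1 for record in records if rule.check(record))
        score = 100.0 * passed / len(records)
        results[rule.dimension] = {
            "score": score,
            "acceptable": score >= rule.threshold,
        }
    return results


# Hypothetical customer records: one blank email, one impossible age.
records = [
    {"id": 1, "email": "a@example.com", "age": 34},
    {"id": 2, "email": "", "age": 41},
    {"id": 3, "email": "c@example.com", "age": -5},
    {"id": 4, "email": "d@example.com", "age": 29},
]

# Step 3: define what good quality means for each dimension (assumed values).
rules = [
    Rule("completeness", lambda r: bool(r["email"]), threshold=90.0),
    Rule("validity", lambda r: 0 <= r["age"] <= 120, threshold=70.0),
]

# Steps 4 and 5: apply the criteria, then review the results.
report = assess(records, rules)
for dimension, result in report.items():
    status = "acceptable" if result["acceptable"] else "needs corrective action"
    print(f"{dimension}: {result['score']:.0f}% ({status})")
```

Running the same assessment on a schedule and storing each report would support step 7, monitoring trends over time.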

The outputs of different data quality checks may be required in order to determine how well the
data supports a particular business need. Data quality checks will not provide an effective
assessment of fitness for purpose if a particular business need is not adequately reflected in data
quality rules. Similarly, when undertaking repeat data quality assessments, you should check to
determine whether business data requirements have changed since the last assessment.

Whilst most data quality dimensions can be assessed by analysing the data itself, assessing
accuracy of data can only be achieved by either:

• Assessing the data against the actual thing it represents, for example, when an employee visits a
property; or

• Assessing the data against an authoritative reference data set, for example, checking customer
details against the official list of voters
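
The second option, checking against an authoritative reference data set, might look like the sketch below. The function names, the sample addresses and the normalisation rule (ignore case and surrounding whitespace) are assumptions for illustration. Records that do not appear in the reference set cannot be verified, so this sketch excludes them from the score rather than counting them as wrong.

```python
from typing import Dict, Hashable


def normalise(value: str) -> str:
    """Assumed matching rule: ignore case and surrounding whitespace."""
    return value.strip().casefold()


def accuracy_against_reference(
    data: Dict[Hashable, str], reference: Dict[Hashable, str]
) -> float:
    """Percentage of verifiable records whose value matches the reference."""
    checkable = [key for key in data if key in reference]
    if not checkable:
        raise ValueError("no records could be matched to the reference data set")
    matches = sum(
        1 for key in checkable if normalise(data[key]) == normalise(reference[key])
    )
    return 100.0 * matches / len(checkable)


# Hypothetical customer addresses keyed by customer id.
customers = {
    101: "12 High Street",
    102: "4 Mill Lane ",
    103: "9 Park Road",
    104: "1 New Street",   # not present in the reference, so not verifiable
}
# Hypothetical authoritative reference data.
reference = {
    101: "12 High Street",
    102: "4 mill lane",
    103: "7 Park Road",
}

print(f"accuracy: {accuracy_against_reference(customers, reference):.1f}%")
```

How many records could not be verified is worth reporting alongside the accuracy figure, since a high score over a small verifiable subset says little about the whole data set.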

SIX CORE DATA QUALITY DIMENSIONS

The six core dimensions of data quality are:

1. Completeness

2. Uniqueness

3. Timeliness

4. Validity

5. Accuracy

6. Consistency
