Data Mapping Best Practices
The healthcare industry collects vast amounts of electronic data. These data are captured in a wide variety of
formats using various collection methods. Using and reusing health data for multiple purposes can
maximize efficiency, minimize discrepancies and errors caused by multiple data entry processes, reduce
costs of data acquisition and storage, and support health information exchange and interoperability. When
data are collected in a specific format or coding system and the same or similar information is needed for a
different purpose, data maps from one system to another facilitate the reuse of data.
In order for the data to be useful for all their intended purposes, semantic interoperability is required to
achieve meaningful exchange across settings, data sets, and standards. Maps are one approach organizations
are considering to achieve this goal.
The International Organization for Standardization's preferred definition of mapping is "the process of
associating concepts or terms from one coding system to concepts or terms in another coding system and
defining their equivalence in accordance with a documented rationale and a given purpose."1 The term
"coding system" is used to depict encoded data, generically including clinical terminologies, administrative
codes, vocabularies, classification systems, and any type of schema used to represent data in health
information systems.
This practice brief defines key data mapping concepts and outlines best practices related to the development
and use of data maps.
Mapping Equivalence Examples
ICD-9-CM Code | ICD-9-CM Name | Equivalence | SNOMED CT Code | SNOMED CT Name
001.0 | | Equal | 63650001 | Cholera
Code 001.0 has a single relationship to a single SNOMED CT concept. The SNOMED CT concept
"Cholera" (63650001) is clinically equivalent as it contains the attribute "causative agent" as Vibrio
cholerae.
Single Mapping between ICD-9-CM and SNOMED CT with Related Concepts in Both Systems
ICD-9-CM Code | ICD-9-CM Name | Equivalence | SNOMED CT Code | SNOMED CT Name
282.7 | Other hemoglobinopathies | Related | 80141007 | Hemoglobinopathy
The concepts in each system vary, creating an approximation. In this case the closest concept in
SNOMED CT is hemoglobinopathy since the type is not identifiable in the ICD-9-CM classification.
Map Based on the Rules and Guidelines of the Source and Target Systems
ICD-9-CM Code | ICD-9-CM Name | Equivalence | SNOMED CT Code | SNOMED CT Name
774.6 | Fetal/Neonatal jaundice (NNJ) | | 387712008 | Neonatal jaundice
Code 774.6, Unspecified fetal and neonatal jaundice, specifically excludes jaundice in preterm infants,
which has its own code of 774.2, Neonatal jaundice associated with preterm delivery. The SNOMED
CT concept of "Neonatal jaundice associated with preterm delivery" (73749009) is therefore excluded
as a potential mapping target.
One-to-Many Mappings between ICD-9-CM and SNOMED CT
ICD-9-CM Code | ICD-9-CM Name | Equivalence | SNOMED CT Code | SNOMED CT Name
070.23 | Viral hepatitis B with hepatic coma, chronic, with hepatitis delta | Related | 235869004 | Chronic viral hepatitis B with hepatitis D
070.23 | Viral hepatitis B with hepatic coma, chronic, with hepatitis delta | Related | 26206000 | Viral hepatitis B with hepatic coma
ICD-9-CM Code | ICD-9-CM Name | Equivalence | SNOMED CT Code | SNOMED CT Name
008.3 | Proteus (Mirabilis/Morganii) enteritis | Related | 30493003 |
008.3 | Proteus (Mirabilis/Morganii) enteritis | Related | 36529003 |
Code 008.3 has relationships to two SNOMED CT concepts. Either of the SNOMED CT codes could be
properly linked to the classification. In this case it is important to consider how the resulting map affects
its use.
ICD-9-CM and Local Code Mappings to a Single SNOMED CT Target
ICD-9-CM Code | ICD-9-CM Name | Equivalence | SNOMED CT Code | SNOMED CT Name
002.9 | Paratyphoid fever | Equal | 85904008 | Paratyphoid fever
002.9ZZ | Paratyphoid fever (severe) | Related | 85904008 | Paratyphoid fever
In this case 002.9ZZ is a local code mapped to a related SNOMED CT code because SNOMED CT concepts
do not capture the severity of the condition.
Mapping Relationships
A map should describe how the source and target are related. Degree of equivalency, match rating, and a
rating scale using numbers with designated values are used to describe the relationship between the source
and the target.2
Equivalence describes the relationship between the source and target and informs users how close or distant
the two systems are. A map's degree of equivalence affects its utility and reliability.
While one source code may map to one target code, the two codes may not have the exact same meaning.
This is especially true when mapping a terminology to a classification. Map developers must identify the
degree of equivalence for each map and document how it was determined. When maps are used for clinical
care, the designation of equivalence is critical because it removes ambiguity about how close the match
is.
From a data integrity perspective it is equally important for the statement of equivalence to be easily
understood and consistently applied by all map users. Because the map developer generates this
documentation, the exact terms may vary; however, the general concept is the same. Common terms
include:
No match, no map, no code
Approximate match, approximate map, related match
Exact match, exact map, equivalent match, equivalent map, equal
The International Organization for Standardization's technical report illustrates a 1 to 5 rating scale to
determine the degree of equivalence.3 For example, "no match or map" means a concept exists in one of the
coding systems without a similar concept in the other system. In the rating scale, "1" represents equivalent
meaning, while "5" indicates that no map is possible between the source and target.
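As a minimal sketch of how such a scale might be enforced in software, the Python below validates a rating and returns a coarse label. Only the two endpoints come from the text above. The function name and the generic "intermediate" label for ratings 2 through 4 are assumptions made for illustration, since the meaning of those values depends on the scale in use.

```python
def describe_rating(rating: int) -> str:
    """Return a coarse label for a 1-to-5 equivalence rating.

    Only the endpoints (1 and 5) are taken from the text; the
    "intermediate" label for 2 through 4 is a placeholder assumption.
    """
    # Reject booleans explicitly because bool is a subclass of int in Python.
    if isinstance(rating, bool) or not isinstance(rating, int):
        raise ValueError(f"rating must be an integer from 1 to 5, got {rating!r}")
    if not 1 <= rating <= 5:
        raise ValueError(f"rating must be an integer from 1 to 5, got {rating!r}")
    if rating == 1:
        return "equivalent meaning"
    if rating == 5:
        return "no map possible"
    return "intermediate"


for value in (1, 3, 5):
    print(value, describe_rating(value))
```

Rejecting out-of-range values at the point of entry keeps every map record on the same documented scale, which supports the consistent application described earlier.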
Other designations associated with mapping identify how many concepts in each system are necessary to
achieve the closest possible approximation. This may be referred to as the "cardinality of the map."
Frequently used terms to describe the elements of the map set include:
One to one: one concept is mapped for both the source and target system
One to many: one concept in the source is mapped to multiple concepts in the target
Many to one: multiple concepts in the source are mapped to one in the target
Many to many: multiple concepts are mapped in both the source and target system
For instance, one to one indicates that a single concept in the source system has a relationship to a
single concept in the target system.
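The cardinality terms above can be derived mechanically from a list of map records. The sketch below is one possible approach, not a prescribed method: it treats each record as a (source code, target code) pair and labels it by counting how many targets the source has and how many sources the target has. The function name and the pair layout are assumptions; the sample codes are taken from the equivalence examples in this brief.

```python
from collections import defaultdict


def classify_cardinality(pairs):
    """Label each (source, target) pair with its map cardinality."""
    targets_of = defaultdict(set)  # source code -> set of target codes
    sources_of = defaultdict(set)  # target code -> set of source codes
    for source, target in pairs:
        targets_of[source].add(target)
        sources_of[target].add(source)

    labels = {}
    for source, target in pairs:
        many_targets = len(targets_of[source]) > 1
        many_sources = len(sources_of[target]) > 1
        if many_targets and many_sources:
            labels[(source, target)] = "many to many"
        elif many_targets:
            labels[(source, target)] = "one to many"
        elif many_sources:
            labels[(source, target)] = "many to one"
        else:
            labels[(source, target)] = "one to one"
    return labels


example_pairs = [
    ("001.0", "63650001"),    # single source, single target
    ("008.3", "30493003"),    # one source with two targets
    ("008.3", "36529003"),
    ("002.9", "85904008"),    # standard code and local code share a target
    ("002.9ZZ", "85904008"),
]

for pair, label in classify_cardinality(example_pairs).items():
    print(pair, label)
```

Note that cardinality is computed here only from the records in hand, so the label for a pair can change when records are added to or removed from the map.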
"Mapping Equivalence Examples" above provides equivalence examples for data maps. The first example
illustrates a map from ICD-9-CM (a classification system) to SNOMED CT (a clinical terminology) to
support the conversion of information captured in ICD to populate an electronic health record system using
SNOMED CT. It clearly indicates the degree of equivalency between the source and target system as
"equal." The subsequent examples illustrate other types of equivalency.
Identifying the degree of equivalency within a map facilitates proper use of data map types for the intended
result.
Description | CPT Code | LOINC Code | Charge
123456789 | 83036 | 55454-3 | $15
Map Types
There are three general types of maps:
Standards development organization maps. These maps are either created or adopted by a standards
development organization or in cooperation between organizations. An example of this type of
cooperative project is the current harmonization agreement between the International Health
Terminology Standards Development Organisation and the World Health Organization. These
organizations are working together to ensure maps between SNOMED CT and ICD-10 are properly
developed and SNOMED CT and the classification codes in ICD-10 are linked
where possible.
Code | Name
1002-5 |
2028-9 | Asian
2054-5 |
2076-8 |
2106-3 | White
2131-1 | Other Race
R1 |
R2 | Asian
R3 | Black/African American
R4 |
R5 | White
R9 | Other Race
UNKNOW | Unknown/Not Specified
Source: United States Health Information Knowledgebase: Division of Healthcare Finance and
Policy
A Data Map to Support Administrative Functions That Use the Current HL7 Standards
This map demonstrates how local data concepts may be mapped to a standard. This is another
form of a proprietary map.
Local Data Value | Local Meaning Name | Mapped Result Closest to the Standard
NATAM | Native American |
eski | Eskimo |
negro | Negro (Black) |
poly | Polynesian |
cauc | Caucasian | White
MULTI | Mixed Race | No Map
This example illustrates the necessity of guidelines or rules for maps. Guidelines or rules are
needed to build a useful map to the standard. In this instance the guidance is provided by the
United States Office of Management and Budget.
The graphic below demonstrates the importance of updating maps. Prior to October 2005, ICD-9-CM
code 799.0 was mapped to SNOMED concept 70067009.
In October 2005 the valid ICD-9-CM code was expanded to five digits, 799.01. An existing rule of the
map required that all valid billable codes must have a map to SNOMED CT, and therefore new ICD-9-CM code 799.01 was mapped to 70067009.
In November 2010, SNOMED CT retired the target concept (70067009), and ICD-9-CM code 799.01
was re-mapped to 66466001.
Mapping Update
[Figure: ICD-9 to SNOMED mapping over time. Prior to October 2005, Asphyxia (799.0) maps to Asphyxia (70067009). From October 2005, Asphyxia (799.01), under Asphyxia and hypoxemia (799.0), maps to Asphyxia (70067009). After the target concept is retired, Asphyxia (799.01) maps to Asphyxiation (66466001).]
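The maintenance scenario in this sidebar can be sketched in Python as follows. The example flags map records whose target concept has been retired and then applies a reviewed replacement. The dictionary layout and the function names `find_stale_sources` and `apply_replacements` are hypothetical. A real map would also carry versions, equivalence, and review status, and a replacement target would be chosen by a qualified reviewer rather than applied automatically.

```python
def find_stale_sources(code_map, retired_targets):
    """Return source codes whose current target has been retired."""
    return sorted(src for src, tgt in code_map.items() if tgt in retired_targets)


def apply_replacements(code_map, replacements):
    """Return a new map with retired targets swapped for their replacements.

    Targets that have no entry in `replacements` are left unchanged.
    """
    return {src: replacements.get(tgt, tgt) for src, tgt in code_map.items()}


# Codes taken from the sidebar: 799.01 was mapped to 70067009,
# which was later retired and replaced by 66466001.
current_map = {"799.01": "70067009"}
retired = {"70067009"}
reviewed_replacements = {"70067009": "66466001"}

print(find_stale_sources(current_map, retired))
print(apply_replacements(current_map, reviewed_replacements))
```

Returning a new dictionary rather than editing the map in place keeps the previous version available for comparison and audit.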
Decision to Map
After evaluating existing maps, an organization may decide to create its own customized map. If it does, it
must first clearly define the business use case of the map.
The use case is a scenario describing how intended users will interact with the map, and it should include
"the 'actors,' priorities, pre- and post-conditions (including input and output), flow of events, user interface
issues, and more."4 Examples of different use cases include a map for clinical decision support or a map to
support billing and reimbursement.
The use case may address issues such as:
What problem is the map trying to solve
How the organization will create, use, and maintain the map
What cost center pays for the map
Organizations that choose to create their own data maps must also carefully consider the resources necessary
for maintenance and updating once the initial mapping has been completed. Various systems update on
asynchronous cycles, and the effort required to maintain maps is often underestimated. For example, ICD-9-CM has the potential to update twice a year, in April and October; CPT may also update twice a year.
However, terminologies such as LOINC, RxNorm, and SNOMED CT have their own update schedules.
The "Mapping Update" sidebar above demonstrates the importance of maintaining maps. It illustrates how routine updates to
ICD-9-CM and SNOMED CT over the years required changes within the map.
Keeping a map current can prove to be unwieldy and expensive in times of lean business practices. An
organization may choose not to create or use maps due to time, resource, or financial constraints.
However, when an organization decides to develop or implement a map, there are fundamental steps that
must be taken to ensure reliable, expected results.
Define a use case for how the content will be used within applications. Questions to ask include:
Who will use the maps?
Is the mapping between standard terminologies or between proprietary (local) terminologies?
Are there delivery constraints or licensing issues?
What systems will rely on the map as a data source?
Develop rules (heuristics) to be implemented within the project. Questions to ask when developing the
rules include:
What is the version of source and target schema to be used?
What is included or excluded?
How will the relationship between source and target be defined (e.g., are maps equivalent, related,
etc.)?
What procedures will be used for ensuring intercoder/inter-rater reliability (reproducibility) in the
map development phase?
What parameters will be used to ensure usefulness? (For example, a map from the SNOMED CT
concept "procedure on head" could be mapped to hundreds of CPT codes, making the map virtually
useless.)
What tools will be used to develop and maintain the map?
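One of the rule questions above, the parameter used to ensure usefulness, lends itself to an automated check. The sketch below flags source concepts that map to more targets than a chosen limit. The function name, the `max_targets` parameter, and the placeholder codes (`SRC-A`, `TGT-0`, and so on) are all invented for illustration and are not real terminology codes; the appropriate limit would come from the project's documented rules.

```python
from collections import defaultdict


def overbroad_sources(pairs, max_targets):
    """Return {source: target_count} for sources exceeding max_targets."""
    targets_of = defaultdict(set)
    for source, target in pairs:
        targets_of[source].add(target)
    return {
        source: len(targets)
        for source, targets in targets_of.items()
        if len(targets) > max_targets
    }


# Placeholder data: SRC-A fans out to five targets, SRC-B to one.
sample_pairs = [("SRC-A", f"TGT-{i}") for i in range(5)] + [("SRC-B", "TGT-0")]

print(overbroad_sources(sample_pairs, max_targets=3))
```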
Plan a pilot phase to test the rules. Maps must be tested and deemed "fit for purpose," meaning they are
performing as desired. This may be done using random samples of statistically significant size. Additional
pilot phases may be needed until variances from the expected result are resolved. Reproducibility is a
fundamental best practice when mapping.
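A pilot sample is easier to audit when the draw itself can be reproduced. The sketch below shows one way to do that in Python with a fixed seed. The function name and the seed value are assumptions, and the sketch does not address how large the sample must be, which depends on the statistical method the organization adopts.

```python
import random


def draw_pilot_sample(records, size, seed):
    """Draw a reproducible simple random sample of map records."""
    records = list(records)
    if not 1 <= size <= len(records):
        raise ValueError("size must be between 1 and the number of records")
    # A dedicated Random instance with a fixed seed repeats the same draw.
    return random.Random(seed).sample(records, size)


# Map records from the examples in this brief, as (source, target) pairs.
map_records = [
    ("001.0", "63650001"),
    ("282.7", "80141007"),
    ("774.6", "387712008"),
    ("002.9", "85904008"),
    ("799.01", "66466001"),
]

pilot = draw_pilot_sample(map_records, size=3, seed=2011)
print(pilot)
```

Recording the seed alongside the pilot results lets a later reviewer regenerate the same sample against the same list of records.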
Develop full content with periodic testing throughout the process. Organizations should perform a final
quality assurance test for the maps and review those data items unable to be mapped to complete the
mapping phase.
Organizations should release the map results to software configuration management where software and
content are integrated. They should then perform quality assurance testing on the content within the
software application (done in a development environment). They can then deploy the content to the
production environment, or go-live.
Communicate with source and target system owners when issues are identified with the systems that
require attention or additional documentation for clarity.
Whether an organization decides to create a map or chooses to use a map created outside of the
organization, maps should be validated by an objective, qualified third party.
Map Validation
Organizations should follow the understandable, reproducible, and useable principle to develop appropriate
data maps. This principle stipulates that:
The links between data elements should be understood by the user without benefit of a user guide or
100-page manual.
The process to develop the data links must be straightforward enough to be reproduced so that the
same results occur no matter who (human) or what (machine or software program) is creating the
links.
The map is not valuable if it is not useable for the use case it was designed to support.
The validation of maps must be conducted by an entity that has not been involved with the map
development and has no financial or political interest in its use.
Just as there are steps to create a map, there are best practice steps for validation. Organizations should
begin the validation by asking these questions about the map's characteristics:
Is the map easy to understand?
Can the results be reproduced?
Is the use case support evident in the map results?
Organizations should then examine and compare the use case or the purpose of the map for consistency. A
map designed for a specific purpose may not be suitable for a different purpose, so a comparison with the
stated use case using a proper sample of map records is necessary.
Organizations should also complete additional general validation reviews, including:
Using authoritative sources from the standards development organization to link the codes when the
map is between two official standards (e.g., SNOMED CT to ICD or LOINC to CPT)
Drawing a statistically valid sample from the map record population to review the validity of the
map results
Reviewing map heuristics
Enlisting a qualified person without access to the previous work to perform the mapping
independently (blind comparison)
Comparing results and explaining any discordance in detail in a full report
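The blind comparison and discordance report in the list above might be supported by a small script such as the following. It is a sketch under stated assumptions: each map is a dictionary from source code to a single target code, and the function name `compare_maps` is hypothetical. The example reuses codes from this brief, with the two mappers choosing different but individually acceptable targets for 008.3.

```python
def compare_maps(original, independent):
    """Compare two maps over the source codes they share.

    Returns (agreement, discordant), where agreement is the fraction of
    shared sources with identical targets and discordant maps each
    disagreeing source to its (original, independent) targets.
    """
    shared = sorted(set(original) & set(independent))
    discordant = {
        source: (original[source], independent[source])
        for source in shared
        if original[source] != independent[source]
    }
    agreement = (len(shared) - len(discordant)) / len(shared) if shared else 0.0
    return agreement, discordant


developer_map = {
    "001.0": "63650001",
    "282.7": "80141007",
    "008.3": "30493003",
    "002.9": "85904008",
}
blind_map = {
    "001.0": "63650001",
    "282.7": "80141007",
    "008.3": "36529003",
    "002.9": "85904008",
}

agreement, discordant = compare_maps(developer_map, blind_map)
print(agreement)
print(discordant)
```

The discordant entries are only a starting point for the full report. Each one still needs a written explanation, because a disagreement may reflect an ambiguous rule rather than a mapper error.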
Organizations should perform an "in use" review and validation check, including:
Comparing map performance in translating the source to the target for its intended purpose (e.g.,
does it produce the same results every time the map record is used?)
Assessing the concordance between software programs using the map and noting any discordance
Presenting a full report of findings with a qualified panel from the organization using the maps
Vendor-Developed Maps
It is now common for terminology developers and distributors to develop their own mappings or to extend
standards development organization or government maps in what the vendor terms a "value add." EHR
vendors may also develop and incorporate maps into their products. Before an organization uses a vendor-developed map, it must evaluate the map according to the mapping principles outlined above, just as it
would any nonauthoritative map.
In addition, an organization's due diligence should include requesting references from organizations
currently using the map, as well as learning which, and how many, other organizations use the map. Vendor-developed maps can help enormously with map implementation and use if the organization understands the
product and assesses it within the framework of its needs and requirements.
There are no standards for health data maps, nor is there a certification program for health information or
health data maps such as exists for many other technology standards. Thus, it is up to users to ensure that the
maps are appropriate for their purposes and meet their needs.