Data Classification & Sensitivity Label Taxonomy



Published: March 2020

©2020 Microsoft Corporation. All Rights Reserved.


Contents
Abstract
Introduction
What is Data Classification?
What is a Data Classification Framework?
Pain Points in Creating a Data Classification Framework
Creating a Well-Designed Data Classification Framework
Implementing Your Data Classification Framework in Microsoft 365
Mapping Data Classification Levels to Microsoft 365 Sensitivity Labels
Determining how labels will be applied to content
Change Management and Training
Governance and Maintenance
Industry Considerations
See also:
Checklist for Success:

Abstract
This white paper is designed to help business leaders guide their organization through the
process of creating or updating their Data Classification framework in the context of online
services, including how to secure data in Microsoft 365 using Sensitivity Labels. Best practices,
common pain points, and industry considerations are discussed, as well as links to helpful
technical content that will be useful during implementation.

Introduction
Sensitive data presents significant risk to a company if it is stolen, inadvertently shared, or
exposed through a breach. Risk factors include reputational damage, financial impact, and loss
of competitive advantage. Undoubtedly, protecting the data and information your business
manages is a top priority for your organization, but you may find it difficult to know if your
current efforts are truly effective, given the sheer amount of content held by your enterprise.

In addition to volume, your content ranges in importance from highly sensitive and impactful to
trivial and transient, and it can be under the purview of various regulatory compliance
requirements. Knowing what to prioritize and where to apply controls can be a challenge. Read
on to learn about Data Classification, an important tool at your disposal for protecting your
content from theft, sabotage, or inadvertent destruction, and how Microsoft 365 can help
translate your information security goals into reality.

What is Data Classification?


Data Classification is a specialized term used in the fields of cybersecurity and information
governance to describe the process of identifying, categorizing, and protecting content
according to its sensitivity or impact level. In its most basic form, data classification is a means of
protecting your data from unauthorized disclosure, alteration, or destruction based on how
sensitive or impactful it is.

What is a Data Classification Framework?


Often codified in a formal, enterprise-wide policy, a Data Classification framework typically
comprises 3-5 classification levels, each of which usually includes three elements: a name, a
description, and real-world examples. Microsoft recommends no more than 5 top-level parent
labels, each with up to 5 sub-labels (25 total), to keep the User Interface (UI) manageable. Levels
are typically arranged from least to most sensitive, such as Public, Internal, Confidential, and
Highly Confidential. Other level name variations you may encounter include Restricted,
Unrestricted, and Consumer Protected. Microsoft recommends label names that are
self-descriptive and that highlight their relative sensitivity clearly. For instance, Confidential
and Restricted may leave users guessing which is appropriate, while Confidential and Highly
Confidential make it clear which is more sensitive.
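
The structure described above (a name, a description, and examples per level, within the
recommended label limits) can be captured as a small data model and checked automatically. The
following is a minimal sketch, not a Microsoft tool: the class name, the function name, and the
descriptions and examples for the Public, Internal, and Confidential levels are illustrative
assumptions.

```python
from dataclasses import dataclass, field
from typing import List

MAX_PARENT_LABELS = 5  # recommendation quoted above: at most 5 top-level labels
MAX_SUB_LABELS = 5     # and at most 5 sub-labels under each of them


@dataclass
class ClassificationLevel:
    """One framework level: a name, a description, and real-world examples."""
    name: str
    description: str
    examples: List[str]
    sub_labels: List[str] = field(default_factory=list)


def validate_framework(levels: List[ClassificationLevel]) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if len(levels) > MAX_PARENT_LABELS:
        problems.append(f"{len(levels)} top-level levels exceeds {MAX_PARENT_LABELS}")
    names = [level.name for level in levels]
    if len(set(names)) != len(names):
        problems.append("level names must be unique")
    for level in levels:
        if not level.description.strip():
            problems.append(f"{level.name}: missing description")
        if not level.examples:
            problems.append(f"{level.name}: missing real-world examples")
        if len(level.sub_labels) > MAX_SUB_LABELS:
            problems.append(f"{level.name}: more than {MAX_SUB_LABELS} sub-labels")
    return problems


# Ordered from least to most sensitive. Descriptions and examples for the first
# three levels are placeholders invented for this sketch.
framework = [
    ClassificationLevel("Public", "Approved for public release.", ["Press releases"]),
    ClassificationLevel("Internal", "For use inside the organization.", ["Team wiki pages"]),
    ClassificationLevel("Confidential", "Sensitive business data.", ["Contracts"]),
    ClassificationLevel(
        "Highly Confidential",
        "The most sensitive data stored or managed by the enterprise.",
        ["Cardholder Data", "Protected Health Information (PHI)"],
    ),
]

print(validate_framework(framework))
```

A check like this could run whenever the policy document changes, so that a level added without a
description or examples is caught before it reaches end users.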

Example Data Classification Framework Level

Classification Level: Highly Confidential

Description: Highly Confidential data is the most sensitive type of data stored or managed by the
enterprise and may require legal notifications if breached or otherwise disclosed. Highly
Confidential data requires the highest level of control and security, and access should be
limited to "need-to-know."

Examples:
● Sensitive Personally Identifiable Information (Sensitive PII)
● Cardholder Data
● Protected Health Information (PHI)
● Bank Account Data

Lesson learned: Microsoft’s corporate data classification framework originally used a category
and label named ‘Internal’ during the pilot phase. The team found that there were legitimate
reasons for a document with this label to be shared externally, so the label was changed to
‘General’.

Another important component of a Data Classification framework is the controls associated with
each level. It is important to note that Data Classification levels by themselves are simply labels
(or tags) that indicate the value or sensitivity of the content. In order to actually protect that
content, Data Classification frameworks define the controls that should be in place for each of
your data classification levels. These controls may include requirements related to:

● Storage Type and Location
● Encryption
● Access Control
● Data Destruction
● Data Loss Prevention
● Public Disclosure
● Logging and Tracking Access
● Other control objectives, as needed

Your security controls will vary by data classification level, such that the protective measures
defined in your framework increase commensurate with the sensitivity of your content. For
example, your data storage control requirements will vary depending upon the media that is
being used as well as upon the classification level applied to a given piece of content.

Example of data classification controls for a specific storage type

                      Data Classification Level
Storage Type          Confidential    Internal                       Unrestricted
Removable Storage     Prohibited      Prohibited unless encrypted    No control required
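
A control matrix like the one above can also be expressed as a lookup that systems or scripts
consult before allowing an action. The sketch below is only an illustration: the function name is
hypothetical, and the decision to refuse any combination that is not listed (fail closed) is an
assumption of this example rather than something the framework prescribes.

```python
# Control matrix keyed by (storage type, classification level), taken from the
# removable-storage example above.
STORAGE_CONTROLS = {
    ("Removable Storage", "Confidential"): "Prohibited",
    ("Removable Storage", "Internal"): "Prohibited unless encrypted",
    ("Removable Storage", "Unrestricted"): "No control required",
}


def storage_allowed(storage_type: str, level: str, encrypted: bool = False) -> bool:
    """Decide whether content at `level` may be written to `storage_type`."""
    rule = STORAGE_CONTROLS.get((storage_type, level))
    if rule is None:
        # Assumption of this sketch: unknown combinations are refused.
        return False
    if rule == "Prohibited":
        return False
    if rule == "Prohibited unless encrypted":
        return encrypted
    return True  # "No control required"


print(storage_allowed("Removable Storage", "Internal", encrypted=True))
```

Keeping the matrix as data rather than as nested conditions makes it easier to review against the
written policy and to extend with further storage types.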

Correctly applying the right level of data classification can be complex in real-life situations and
may sometimes overwhelm end users. Therefore, once a policy or standard has been created
that defines the required levels of data classification, it remains important to guide end
users on how to bring this framework to life in their daily work. This is where data classification
handling rules or guidelines come in.

Data classification handling guidelines give end users specific guidance on how to handle each
level of data appropriately, across different storage media and throughout its lifecycle.
These guidelines help end users to correctly apply rules in practice, for instance when sharing
documents, sending emails, or collaborating across different platforms and organizations.
Microsoft customers indicate that approximately 50% of an Information Protection project is
business focused rather than technical, so end-user training and communication are critical to
success.

Pain Points in Creating a Data Classification Framework
Data Classification efforts are by nature wide-reaching, touching nearly every business function
within an enterprise. Because of this broad scope and the complexity of managing content in
modern digital environments, companies often face challenges in knowing where to start, how
to manage a successful implementation, and how to measure their progress. Common pain
points include:

● Designing a robust and easy-to-understand Data Classification framework, including
determining classification levels and associated security controls
● Developing an implementation plan that includes confirming the appropriate technology
solution, aligning the plan to existing business processes, and identifying impact to the
workforce
● Setting up a Data Classification framework within the technology solution and addressing
any gaps between the capabilities of the tool and the framework itself
● Establishing a governance structure that oversees the on-going maintenance and health
of Data Classification efforts
● Identifying specific KPIs to monitor and measure progress
● Increasing awareness and understanding of Data Classification policies, why they are
important, and how to comply with them
● Complying with internal audit reviews that target data loss and cybersecurity controls
● Training and engaging end users so that they become mindful of the need for correct
classification in their daily work and apply the right measures accordingly

Creating a Well-Designed Data Classification Framework
As you develop, revamp, or refine your Data Classification framework, consider the following
leading practices:

● Don’t expect to go from 0-100 on day 1 - Microsoft recommends a crawl-walk-run
approach, prioritizing features critical to the organization and mapping them against a
timeline. Complete the first step, ensure it was successful, and then move on to the next
phase applying lessons learned. Remember that your organization may still be exposed
to risk while you design your Data Classification framework, so it’s ok to start small with
just a few classification levels and expand later as needed.

● You’re not just writing for cybersecurity professionals - Data Classification frameworks are
meant for a broad audience, including your average staff member, your legal and
compliance team, and your IT team. This means that it's imperative to write clear, easy-
to-understand definitions for your Data Classification levels, providing real-world
examples wherever possible. Also try to avoid jargon and consider a glossary for
acronyms and highly technical terms. For example, use “Personally Identifiable
Information” and provide a definition instead of simply saying “PII.”

● Data Classification frameworks are meant to be implemented - A Data Classification
framework succeeds only if it can be put into practice. This is especially
relevant when crafting the control requirements for each Data Classification level. Make
sure requirements are clearly spelled out, anticipating and addressing any ambiguity that
might arise during implementation. For example, if you have a control around Personally
Identifiable Information, make sure to spell out exactly what that means, such as Social
Security or Passport Number.

● Only go granular if you need to - As mentioned above, Data Classification frameworks
typically contain anywhere from 3-5 Data Classification levels. But just because you can
include 5 levels doesn’t mean you should. Consider the following criteria when deciding
on the number of classification levels you need:
○ Your industry and your associated regulatory obligations (highly regulated
industries tend to need more classification levels)
○ The operational overhead required to maintain a more complex framework
○ Your users and their ability to comply with the increased complexity and nuance
associated with more classification levels
○ User experience and accessibility when seeking to apply manual classification
across multiple device types

● Get the right people involved – Having a senior stakeholder is critical for success, as many
projects struggle to start or take significantly longer without senior management
backing. Data Classification frameworks are typically owned by Cybersecurity, but they
have legal, compliance, privacy, and change management implications. In order to
ensure you’re creating a framework that truly protects your business, be sure to include
privacy and legal stakeholders such as your Chief Privacy Officer and the Office of
General Counsel in the development of your policy. If your organization has a
Compliance division, Information Governance professionals, or a Records Management
team, they will have valuable things to add as well. As your framework is rolled out to the
business, your Communications department also has a key role to play from a messaging
and adoption perspective.

● Balance security against convenience - A common mistake is to draft a data classification
framework that is very secure but so restrictive that it is difficult to implement in
practice. If end users need to follow complex, rigid, and time-consuming procedures to
apply the framework in their daily work, there is always a risk that they will no longer
believe in its value and will stop following procedures. This risk exists at all levels of the
organization, including the C-suite. A good balance of security against convenience,
alongside easy-to-use tools, will lead to wider end-user support. Also, if there are gaps in
your framework, don’t wait until everything is perfect to start implementation. Instead,
assess the risk or gap, form a plan to mitigate, and continue moving forward. Remember
that Information Protection is a journey; it’s not something that is activated overnight
and then done. Plan, implement some capabilities, confirm success, and iterate to the
next milestone as tools evolve and users gain maturity and experience.

Also keep in mind that a Data Classification framework (sometimes called a “Data Classification
Policy”) addresses only what your organization should do in order to protect sensitive data. Data
Classification frameworks are often accompanied by data handling rules or guidelines, which
define how to put these policies in place from a technical perspective. In the
following sections, we turn to some practical guidance on how to take your Data Classification
framework from a policy document to a fully implemented and actionable initiative.

Implementing Your Data Classification Framework in Microsoft 365
Once you’ve developed your Data Classification framework, your next step is implementation.
While Data Classification frameworks often cast a wide net and have implications for almost all
of an enterprise’s IT applications, the following sections will focus on managing content within
Microsoft 365. In this context, content largely consists of unstructured data such as emails,
documents, and spreadsheets.

Before we dive into the details, here are a few basics you should know about how Microsoft 365,
and specifically Office 365 Security & Compliance Center, enable the Data Classification
principles discussed above:

● Microsoft 365 is a cloud service which brings together Office 365 productivity software,
device management, and security tools.
● The Office 365 Security & Compliance Center provides access to data and tools for
managing enterprise compliance.
● Within the Office 365 Security & Compliance Center, you can use Sensitivity Labels to
classify and help protect your sensitive content.
○ What do Sensitivity Labels do? Sensitivity Labels can enforce a variety of
protections, including encryption; data loss prevention; and content marking such
as headers, footers and watermarks.
○ Who are Sensitivity Labels for? Sensitivity Labels can be published to specific
user audiences through label policies, which can also establish a default label for
those audiences.
○ How are Sensitivity Labels applied? They can be applied manually; by default,
based on policy settings; or automatically, as the result of a condition such as
identified PII.

Mapping Data Classification Levels to Microsoft 365 Sensitivity Labels
For smaller organizations or organizations with a relatively streamlined Data Classification
framework, creating a single Sensitivity Label for each of your Data Classification levels may
suffice.

Excerpt: Example of One-to-One Data Classification Level to Sensitivity Label Mapping

Classification Level    Sensitivity Label    Label Settings                   Published To
Unrestricted            Unrestricted         ● Apply “Unrestricted” footer    All users
General                 General              ● Apply “General” footer         All users

Lesson learned: During the Microsoft internal information protection pilot we found difficulties
with the ‘Personal’ label, as users were confused as to whether this meant PII or merely related
to a personal matter. We changed this to ‘non-business’ to be clearer. This just goes to show that
taxonomy doesn’t need to be perfect from the start. Start with what you think is right, pilot it,
and adjust based on feedback.

For larger organizations with a global reach or more complex information security needs, you
may find this one-to-one relationship between the number of classification levels in your policy
and the number of Sensitivity labels in your Microsoft 365 environment to be a challenge. This is
especially true in global organizations where a given Data Classification level such as
“Restricted” may have a different definition or different set of controls depending on region.

To address this, it is important to align your Data Classification levels with your desired
Sensitivity Label settings. A Sensitivity Label can support a number of settings, including content
marking, data loss prevention, and encryption. Reviewing each Data Classification
level against the available settings will determine whether that level can be supported with
one corresponding Sensitivity Label, or whether the desired features will require more than one.
For instance, your review may determine that a single classification level requires two distinct
encryption settings depending on the nature of specific document types.

Sensitivity Labels can also be grouped hierarchically under a parent label, in order to improve
the user experience. Grouping labels logically provides a more intuitive browsing experience for
users than a simple flat list of choices. This can be especially useful in scenarios where a single
Data Classification level translates into multiple Sensitivity Labels. Sub-labels can add nuance in
the way data is handled via encryption. For instance, a Confidential parent label could have one
sub-label for FTE-only information and another for data that is also accessible by Partners.

Finally, when you publish a Sensitivity Label, keep in mind that you can choose to publish it to all
users, or to specific email-enabled security groups, distribution groups, Office 365 groups, or
dynamic distribution groups. In some cases, creating multiple labels for unique audiences can
add value, in the form of localized names, descriptions, or settings.
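
To make the relationship between parent labels, sub-labels, and publishing audiences concrete,
the following minimal sketch models which labels a user would see. It is a conceptual
illustration only and does not call any Microsoft 365 API: the class names, the policy names,
the group name "Partner Managers", and the sub-label names "FTE Only" and "Partners" are all
hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Set, Tuple

ALL_USERS = "All users"


@dataclass(frozen=True)
class SensitivityLabel:
    name: str
    parent: Optional[str] = None  # set for sub-labels grouped under a parent label

    @property
    def display_name(self) -> str:
        return f"{self.parent} / {self.name}" if self.parent else self.name


@dataclass(frozen=True)
class LabelPolicy:
    name: str
    labels: Tuple[SensitivityLabel, ...]
    published_to: FrozenSet[str]  # group names, or ALL_USERS for everyone


def visible_labels(user_groups: Set[str], policies: List[LabelPolicy]) -> List[str]:
    """Collect the labels published to a user, preserving policy order."""
    seen: List[str] = []
    for policy in policies:
        applies = ALL_USERS in policy.published_to or bool(user_groups & policy.published_to)
        if not applies:
            continue
        for label in policy.labels:
            if label.display_name not in seen:
                seen.append(label.display_name)
    return seen


policies = [
    LabelPolicy(
        "Baseline",
        (SensitivityLabel("General"), SensitivityLabel("FTE Only", parent="Confidential")),
        frozenset({ALL_USERS}),
    ),
    LabelPolicy(
        "Partner collaboration",
        (SensitivityLabel("Partners", parent="Confidential"),),
        frozenset({"Partner Managers"}),
    ),
]

print(visible_labels({"Partner Managers"}, policies))
```

In this sketch a single Confidential classification level maps to two sub-labels, and only
members of the hypothetical partner-facing group see the second one.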

Determining how labels will be applied to content


Once you’ve created Sensitivity Labels that align to your Data Classification framework, you’ll
now need to consider how those labels will be applied to content. This is where your plan is put
into action and we start seeing real benefits. You have a few options which can be used in
isolation or in tandem depending on your needs:

● Data Classification Based on Default Value - When publishing a label policy, you can
identify a specific label to be applied by default to all content created by users and
groups included in the policy. This label can set a floor of protection, even if no other
action is taken by users or system settings.

● Data Classification Based on Query - Labels can be applied automatically when content
contains specific types of sensitive information, such as Social Security Numbers.
Alternatively, the system can detect sensitive information types and prompt the user to
optionally apply the label.

● Manual Data Classification - Users can apply labels manually to content. While this
approach requires less up-front configuration and empowers users, it also depends on
users choosing to classify content by sensitivity. For that reason, this approach requires a
higher level of training and buy-in to be successful.
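
The three options above can be combined, and it helps to decide up front how they interact. The
sketch below shows one possible arrangement under stated assumptions: the regular expression is
a simplified stand-in for a Social Security Number detector (not the sensitive information type
used by Microsoft 365), the label names and their ordering are examples, and the rule that a
manual choice is never overridden but may trigger a recommendation is a design choice of this
sketch.

```python
import re
from typing import Optional, Tuple

# Simplified pattern for US Social Security Numbers written as 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Example labels ordered from least to most sensitive.
SENSITIVITY_ORDER = ["General", "Confidential", "Highly Confidential"]
AUTO_LABEL = "Highly Confidential"  # label tied to the SSN condition in this sketch


def rank(label: str) -> int:
    return SENSITIVITY_ORDER.index(label)


def resolve_label(
    text: str,
    default_label: str = "General",
    manual_label: Optional[str] = None,
    auto_apply: bool = True,
) -> Tuple[str, str, Optional[str]]:
    """Return (applied label, how it was applied, recommended label or None)."""
    detected = AUTO_LABEL if SSN_PATTERN.search(text) else None
    base = manual_label if manual_label is not None else default_label
    source = "manual" if manual_label is not None else "default"
    if detected is None or rank(detected) <= rank(base):
        return base, source, None
    if manual_label is None and auto_apply:
        return detected, "automatic", None
    # Either the user chose a label or automatic application is off:
    # keep the current label and only recommend the more sensitive one.
    return base, source, detected


print(resolve_label("Employee SSN: 123-45-6789"))
print(resolve_label("Employee SSN: 123-45-6789", manual_label="Confidential"))
```

The default label acts as the floor of protection, detection raises the label when nothing else
has been chosen, and the recommendation path mirrors the option of prompting the user instead of
applying the label automatically.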

[Figure: available sensitivity labels in Excel, shown from the Home tab on the Ribbon. In this
example, the applied label displays on the status bar.]

Change Management and Training


Organizations today leverage tools such as Microsoft 365 to implement their data classification
framework. The aim is to automate the classification of data where possible rather than increase
the burden on your workforce. Automation does not remove your organization’s responsibility to
raise awareness of the need to manage content and protect the organization from the
risks discussed in this paper. The leading practice continues to be to conduct, at a minimum,
awareness training across the organization as part of the annual training schedule. Our
experience shows that putting robust and comprehensive effort into training your end users,
who are the key audience performing this work, increases their buy-in to the effort and can
increase adoption and quality. Adding label recommendations and in-app tips can amplify these
efforts. This training does not need to be an extensive standalone course; you can incorporate an
overview of data classification levels and definitions into other regular training, such as your
annual information security training. The main point is that your workforce understands
that although the tool automates the classification of data, automation does not
eliminate the workforce’s overall responsibility for protecting data in accordance with your
company policy.

In addition, you should consider more in-depth training for IT and information security staff to
reinforce operational readiness. The staff who manage the tool and the staff who own the data
classification framework must be on the same page. This will require you to invest in a more
robust training schedule, which may need to run more often than annually. This investment
represents another avenue to reduce risk to your organization, as these staff are responsible for
the implementation and could therefore be a point of failure if not properly trained on both the
tool and the policy.

In the event that you need to manually tag content in the tool, developing a group of super
users who have received more advanced training would be appropriate. These super users
would be engaged in situations where users are required to manually tag documents with data
sensitivity labels, and they would have a deep understanding of your organization’s data
classification framework and regulatory requirements.

Finally, your leadership should champion information security behaviors in order to reinforce
to the workforce the importance of risk management initiatives such as developing and
implementing a robust data classification framework. Consider assigning key leaders, sometimes
referred to as ambassadors or champions of the change, to promote the initiative.

Governance and Maintenance


After you’ve developed and implemented your Data Classification framework, ongoing
governance and maintenance will be critical to your success. In addition to tracking how
Sensitivity Labels are used in practice, you’ll need to update your control requirements based on
changes in regulations, cybersecurity leading practices, and the nature of the content you
manage. Governance and maintenance efforts may include:

● Establishing a governance body dedicated to Data Classification or adding a Data
Classification responsibility to the charter of an existing Information Security body
● Defining roles and responsibilities for those overseeing Data Classification
● Establishing KPIs to monitor and measure progress
● Tracking cybersecurity leading practices and regulatory changes
● Developing Standard Operating Procedures that support and enforce a data
classification framework
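
As one illustration of a KPI from the list above, label coverage (the share of content that
carries any Sensitivity Label) and the distribution of labels can be computed from an inventory
of documents. This is a minimal sketch that assumes a hypothetical export in which each document
is a dictionary with a "label" entry that is empty or missing when no label is applied; it does
not reflect any particular Microsoft 365 report format.

```python
from collections import Counter
from typing import Dict, List


def label_kpis(documents: List[Dict]) -> Dict:
    """Summarize how many documents are labeled and with which labels."""
    total = len(documents)
    counts = Counter(doc["label"] for doc in documents if doc.get("label"))
    labeled = sum(counts.values())
    coverage = labeled / total if total else 0.0
    return {
        "total": total,
        "labeled": labeled,
        "coverage": coverage,
        "by_label": dict(counts),
    }


# Hypothetical inventory used only to exercise the function.
inventory = [
    {"name": "budget.xlsx", "label": "Confidential"},
    {"name": "newsletter.docx", "label": "General"},
    {"name": "notes.txt", "label": None},
    {"name": "payroll.xlsx", "label": "Confidential"},
]

print(label_kpis(inventory))
```

Tracking a figure like coverage over time gives the governance body a simple way to see whether
training and default labels are having the intended effect.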

Industry Considerations
While the basic principles for developing a strong data classification framework are universal,
the details of your framework will depend on the nature of your industry and the unique
compliance and security factors your data demands.

For instance, financial services firms may need to consider compliance with a number of
regulatory frameworks depending on the scope of their business and the regions in which they
operate. For securities firms in the US, this means taking into account regulations like FINRA
Rule 4511, which addresses requirements around the security and retention of books and
records; similarly, firms operating in the UK need to consider FCA compliance.

Government agencies face a variety of regulations governing their data, which vary based on
territory and the nature of their work. In the United States, for instance, government agencies
and their agents that access federal tax information (FTI) are subject to IRS Publication 1075,
which aims to minimize the risk of loss, breach, or misuse of federal tax information.

While financial services firms and government agencies are among the most heavily regulated
organizations in the world, most businesses have industry-specific considerations that need to
be taken into account. Examples include:

● Health industry organizations ensuring compliance with HIPAA
● Education institutions, from K-12 schools to universities, managing FERPA compliance
● Drug manufacturers working to comply with GxP guidelines in their country or region
around information security
● Media, retail, and many other companies dealing with GDPR compliance
● Companies that deliver and store entertainment, software, and information content
dealing with CDSA requirements
● Energy industry organizations complying with NERC CIP Standards

See also:
● Onboarding and Adoption for Microsoft Information Protection (link pending)
● Microsoft Compliance Offerings

Checklist for Success:
Gather allies, form your stakeholder group, and obtain executive sponsorship

Create a Glossary for terms used in guidance

Assess your risks and any regulatory requirements

Decide how many data classification levels you need

Write clear classification definitions in plain language

Create Data Handling Rules/Guidelines

Map Data Classification levels to Sensitivity Labels

Train your users, both initially and on a recurring basis

Start small, pilot, measure, and iterate (Crawl, Walk, Run)

The information contained in this document represents the current view of Microsoft Corporation on the issues
discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should
not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of
any information presented after the date of publication.

This white paper is for informational purposes only. Microsoft makes no warranties, express or implied, in this
document.

Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under
copyright, no part of this document may be reproduced, stored in, or introduced into a retrieval system, or
transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for
any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights
covering subject matter in this document. Except as expressly provided in any written license agreement from
Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights,
or other intellectual property.

© 2020 Microsoft Corporation. All rights reserved.

Microsoft is either a registered trademark or a trademark of Microsoft Corporation in the United
States and/or other countries.
