Data Classification & Sensitivity Label Taxonomy
Abstract
This white paper is designed to help business leaders guide their organization through the
process of creating or updating their Data Classification framework in the context of online
services, including how to secure data in Microsoft 365 using Sensitivity Labels. Best practices,
common pain points, and industry considerations are discussed, as well as links to helpful
technical content that will be useful during implementation.
Introduction
Sensitive data presents significant risk to a company if it is stolen, inadvertently shared, or
exposed through a breach. Risk factors include reputational damage, financial impact, and loss
of competitive advantage. Undoubtedly, protecting the data and information your business
manages is a top priority for your organization, but you may find it difficult to know if your
current efforts are truly effective, given the sheer amount of content held by your enterprise.
In addition to volume, your content ranges in importance from highly sensitive and impactful to
trivial and transient, and it can be under the purview of various regulatory compliance
requirements. Knowing what to prioritize and where to apply controls can be a challenge. Read
on to learn about Data Classification, an important tool at your disposal for protecting your
content from theft, sabotage, or inadvertent destruction, and how Microsoft 365 can help
translate your information security goals into reality.
Confidential. Other level name variations you may encounter include Restricted, Unrestricted,
and Consumer Protected. Microsoft recommends label names that are self-descriptive and that
highlight their relative sensitivity clearly. For instance, Confidential and Restricted may leave
users guessing which is appropriate, while Confidential and Highly Confidential make it clearer
which is more sensitive.
Lesson learned: During the pilot phase, Microsoft’s corporate data classification framework used a
category and label named ‘Internal’. Microsoft found that there were legitimate reasons for a
document to be shared externally and shifted to using ‘General’.
Another important component of a Data Classification framework is the controls associated with
each level. It is important to note that Data Classification levels by themselves are simply labels
(or tags) that indicate the value or sensitivity of the content. In order to actually protect that
content, Data Classification frameworks define the controls that should be in place for each of
your data classification levels. These controls may include requirements related to:
Your security controls will vary by data classification level, such that the protective measures
defined in your framework increase commensurate with the sensitivity of your content. For
example, your data storage control requirements will vary depending upon the media that is
being used as well as upon the classification level applied to a given piece of content.
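As a rough illustration of how controls can scale with sensitivity, the sketch below models classification levels as an ordered enumeration and looks up a storage requirement by level and storage medium. The level names follow the examples in this paper; the storage media and the control values are hypothetical placeholders, not recommendations.

```python
from enum import IntEnum


class Level(IntEnum):
    """Classification levels ordered from least to most sensitive."""
    GENERAL = 1
    CONFIDENTIAL = 2
    HIGHLY_CONFIDENTIAL = 3


# Hypothetical storage controls keyed by (level, medium). Real values
# come from your own framework, not from this sketch.
STORAGE_CONTROLS = {
    (Level.GENERAL, "laptop"): "disk encryption",
    (Level.GENERAL, "removable media"): "allowed",
    (Level.CONFIDENTIAL, "laptop"): "disk encryption",
    (Level.CONFIDENTIAL, "removable media"): "encrypted media only",
    (Level.HIGHLY_CONFIDENTIAL, "laptop"): "disk encryption plus file encryption",
    (Level.HIGHLY_CONFIDENTIAL, "removable media"): "prohibited",
}


def storage_control(level: Level, medium: str) -> str:
    """Return the storage requirement for a level on a given medium."""
    try:
        return STORAGE_CONTROLS[(level, medium)]
    except KeyError:
        raise ValueError(f"No control defined for {level.name} on {medium}")


print(storage_control(Level.CONFIDENTIAL, "removable media"))
```

Keeping the levels ordered makes it easy to check that a higher level never carries a weaker control than the level below it.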
Correctly applying the right level of data classification can be complex in real-life situations and
may sometimes overwhelm end users. Therefore, once a policy or standard has been created
that defines the required levels of data classification, it remains important to also guide end
users on how to bring this framework to life in their daily work. This is where data classification
handling rules or guidelines come in.
Data classification handling guidelines give end users specific guidance on how to handle each
level of data appropriately, across different storage media and throughout the data lifecycle.
These guidelines help end users to correctly apply rules in practice, for instance when sharing
documents, sending emails, or collaborating across different platforms and organizations.
Microsoft customers indicate that approximately 50% of an Information Protection project is
business focused rather than technical, so end-user training and communication are critical to
success.
● Designing a robust and easy-to-understand Data Classification framework, including
determining classification levels and associated security controls
● Developing an implementation plan that includes confirming the appropriate technology
solution, aligning the plan to existing business processes, and identifying impact to the
workforce
● Setting up a Data Classification framework within the technology solution and addressing
any gaps between the technology capabilities of the tool and the framework itself
● Establishing a governance structure that oversees the on-going maintenance and health
of Data Classification efforts
● Identifying specific KPIs to monitor and measure progress
● Increasing awareness and understanding of Data Classification policies, why they are
important, and how to comply with them
● Complying with internal audit reviews that target data loss and cybersecurity controls
● Training and engaging end users so that they become mindful of the need for correct
classification in their daily work and apply the right measures accordingly
● You’re not just writing for cybersecurity professionals – Data Classification frameworks are
meant for a broad audience, including your average staff member, your legal and
compliance team, and your IT team. This means that it's imperative to write clear, easy-
to-understand definitions for your Data Classification levels, providing real-world
examples wherever possible. Also try to avoid jargon and consider a glossary for
acronyms and highly technical terms. For example, use “Personally Identifiable
Information” and provide a definition instead of simply saying “PII.”
relevant when crafting the control requirements for each Data Classification level. Make
sure requirements are clearly spelled out, anticipating and addressing any ambiguity that
might arise during implementation. For example, if you have a control around Personally
Identifiable Information, make sure to spell out exactly what that means, such as Social
Security or Passport Number.
● Get the right people involved – Having a senior stakeholder is critical for success, as many
projects struggle to start or take significantly longer without senior management
backing. Data Classification frameworks are typically owned by Cybersecurity, but they
have legal, compliance, privacy, and change management implications. In order to
ensure you’re creating a framework that truly protects your business, be sure to include
privacy and legal stakeholders such as your Chief Privacy Officer and the Office of
General Counsel in the development of your policy. If your organization has a
Compliance division, Information Governance professionals, or a Records Management
team, they will have valuable things to add as well. As your framework is rolled out to the
business, your Communications department also has a key role to play from a messaging
and adoption perspective.
● Balance security against convenience – A common mistake is to draft a data classification
framework that is very secure but so restrictive that it is difficult to implement in
practice. If end users must follow complex, rigid, and time-consuming procedures to apply
the framework in their daily work, there is always a risk that they will stop believing in
its value and stop following procedures. This risk exists at all levels of the organization,
including the C-suite. A good
balance of security against convenience alongside easy-to-use tools will lead to wider
end-user support. Also, if there are gaps in your framework, don’t wait until everything
is perfect to start implementation. Instead, assess the risk or gap, form a plan to mitigate,
and continue moving forward. Remember that Information Protection is a journey – it’s
not something that is activated overnight and then done. Plan, implement some
capabilities, confirm success, and iterate to the next milestone as tools evolve and users
gain maturity and experience.
Also keep in mind that a Data Classification framework (sometimes called a “Data Classification
Policy”) only addresses what your organization should do in order to protect sensitive data. Data
Classification frameworks are often accompanied by data handling rules or guidelines that
define how to put these policies in place from a technical perspective. In the
following sections, we turn to some practical guidance on how to take your Data Classification
framework from a policy document to a fully implemented and actionable initiative.
Before we dive into the details, here are a few basics you should know about how Microsoft 365,
and specifically the Office 365 Security & Compliance Center, enables the Data Classification
principles discussed above:
● Microsoft 365 is a cloud service which brings together Office 365 productivity software,
device management, and security tools.
● The Office 365 Security & Compliance Center provides access to data and tools for
managing enterprise compliance.
● Within the Office 365 Security & Compliance Center, you can use Sensitivity Labels to
classify and help protect your sensitive content.
○ What do Sensitivity Labels do? Sensitivity Labels can enforce a variety of
protections, including encryption; data loss prevention; and content marking such
as headers, footers and watermarks.
○ Who are Sensitivity Labels for? Sensitivity Labels can be published to specific
user audiences through label policies, which can also establish a default label for
those audiences.
○ How are Sensitivity Labels applied? They can be applied manually; by default,
based on policy settings; or automatically, as the result of a condition such as
identified PII.
Lesson learned: During the Microsoft internal information protection pilot we found difficulties
with the ‘Personal’ label, as users were confused as to whether this meant PII or merely related
to a personal matter. We changed this to ‘non-business’ to be clearer. This just goes to show that
taxonomy doesn’t need to be perfect from the start. Start with what you think is right, pilot it,
and adjust based on feedback.
For larger organizations with a global reach or more complex information security needs, you
may find this one-to-one relationship between the number of classification levels in your policy
and the number of Sensitivity labels in your Microsoft 365 environment to be a challenge. This is
especially true in global organizations where a given Data Classification level such as
“Restricted” may have a different definition or different set of controls depending on region.
To address this, it is important to align your Data Classification levels with your desired
Sensitivity Label settings. A Sensitivity Label can support a number of settings, including content
marking, data loss prevention, and encryption. Reviewing each Data Classification level against
the available settings will determine whether it can be supported by a single corresponding
Sensitivity Label, or whether the desired features require more than one. For instance,
your review may determine a single classification level requires two unique encryption settings
depending on the nature of specific document types.
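The review described above can be pictured as grouping the content types that fall under one classification level by the label settings they require; each distinct combination of settings would need its own Sensitivity Label. The sketch below is only a planning aid. The document types and setting names are hypothetical, and it does not represent how Microsoft 365 stores label configuration.

```python
from collections import defaultdict

# Hypothetical review results for a single classification level:
# document type -> the settings that type requires.
REQUIRED_SETTINGS = {
    "contract": {"encryption": "internal-only", "marking": "footer"},
    "board minutes": {"encryption": "named-users", "marking": "footer"},
    "hr record": {"encryption": "internal-only", "marking": "footer"},
}


def labels_needed(required):
    """Group document types by identical settings; one label per group."""
    groups = defaultdict(list)
    for doc_type, settings in required.items():
        # Sort the settings so that equal dictionaries produce equal keys.
        key = tuple(sorted(settings.items()))
        groups[key].append(doc_type)
    return [sorted(doc_types) for doc_types in groups.values()]


groups = labels_needed(REQUIRED_SETTINGS)
print(len(groups), "label(s) needed:", groups)
```

In this made-up case the level needs two labels, because board minutes require a different encryption setting from the other two document types.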
Sensitivity Labels can also be grouped hierarchically under a parent label, in order to improve
the user experience. Grouping labels logically provides a more intuitive browsing experience for
users than a simple flat list of choices. This can be especially useful in scenarios where a single
Data Classification level translates into multiple Sensitivity Labels. Sub-labels can add nuance in
the way data is handled via encryption. For instance, a Confidential parent label could have one
sub-label for FTE-only information and another for data that is also accessible to Partners.
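One way to picture the parent and sub-label arrangement is as a small tree that is flattened into the choices a user browses. This is a minimal sketch: the sub-label names echo the example above, while the `access` field, the `Label` class, and the `selectable_labels` helper are hypothetical and not part of any Microsoft API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Label:
    name: str
    # Hypothetical description of who may open protected content.
    access: Optional[str] = None
    sublabels: List["Label"] = field(default_factory=list)


TAXONOMY = [
    Label("General"),
    Label("Confidential", sublabels=[
        Label("FTE Only", access="full-time employees"),
        Label("Partners", access="full-time employees and partners"),
    ]),
]


def selectable_labels(labels, prefix=""):
    """List the choices a user would browse, shown as 'Parent / Sub-label'."""
    choices = []
    for label in labels:
        path = prefix + label.name
        if label.sublabels:
            # In this sketch a parent only groups its sub-labels.
            choices.extend(selectable_labels(label.sublabels, path + " / "))
        else:
            choices.append(path)
    return choices


print(selectable_labels(TAXONOMY))
```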
Finally, when you publish a Sensitivity Label, keep in mind that you can choose to publish it to all
users, or to specific email-enabled security groups, distribution groups, Office 365 groups, or
dynamic distribution groups. In some cases, creating multiple labels for unique audiences can
add value, in the form of localized names, descriptions, or settings.
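The effect of publishing labels to specific audiences can be sketched as a set of policies, each carrying a list of labels and an optional set of target groups. The policy names, the group address, and the 'Legal Privileged' label below are hypothetical, and the matching logic is a simplification for illustration rather than a description of how Microsoft 365 evaluates label policies.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class LabelPolicy:
    name: str
    labels: List[str]
    # None means the policy is published to all users.
    groups: Optional[Set[str]] = None
    default_label: Optional[str] = None


POLICIES = [
    LabelPolicy("Global", ["General", "Confidential"], default_label="General"),
    LabelPolicy("Legal team", ["Legal Privileged"], groups={"legal@contoso.example"}),
]


def labels_for_user(user_groups: Set[str], policies: List[LabelPolicy]) -> List[str]:
    """Collect the labels from every policy that targets the user."""
    visible: List[str] = []
    for policy in policies:
        if policy.groups is None or policy.groups & user_groups:
            for label in policy.labels:
                if label not in visible:
                    visible.append(label)
    return visible


print(labels_for_user({"legal@contoso.example"}, POLICIES))
```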
● Data Classification Based on Default Value - When publishing a label policy, you can
identify a specific label to be applied by default to all content created by users and
groups included in the policy. This label can set a floor of protection, even if no other
action is taken by users or system settings.
● Data Classification Based on Query - Labels can be applied automatically when content
contains specific types of sensitive information, such as Social Security Numbers.
Alternatively, the system can detect sensitive information types and prompt the user to
optionally apply the label.
● Manual Data Classification - Users can apply labels manually to content. While this
approach requires less up-front configuration and empowers users, it also depends on
users choosing to classify content by sensitivity. For that reason, this approach requires a
higher level of training and buy-in to be successful.
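The three approaches above can be combined. The sketch below shows one possible ordering: a manual choice is kept, otherwise an automatic rule fires when sensitive content is detected, otherwise the policy default applies. The precedence, the label names, and the simplified Social Security Number pattern are assumptions for illustration; Microsoft 365 uses its own sensitive information types and policy evaluation.

```python
import re
from typing import Optional

# Simplified pattern for a US Social Security Number (illustrative only;
# production detection needs validation and context checks).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

DEFAULT_LABEL = "General"            # hypothetical policy default
AUTO_LABEL = "Highly Confidential"   # hypothetical label for detected SSNs


def resolve_label(text: str, manual_label: Optional[str] = None) -> str:
    """Return the label for a piece of content under this sketch's rules."""
    if manual_label is not None:
        return manual_label          # manual classification
    if SSN_PATTERN.search(text):
        return AUTO_LABEL            # classification based on query
    return DEFAULT_LABEL             # classification based on default value


print(resolve_label("Lunch menu for Friday"))
print(resolve_label("Employee SSN: 123-45-6789"))
print(resolve_label("Draft press release", manual_label="Confidential"))
```

A variant of this sketch could return a recommendation instead of a label when the pattern matches, mirroring the option to prompt the user rather than label automatically.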
Example showing available sensitivity labels in Excel, from the Home tab on the Ribbon. In this
example, the applied label displays on the status bar:
In addition, you should consider more in-depth training for IT and information security staff to
reinforce operational readiness. The staff who manage the tool and the staff who manage the data
classification framework must be on the same page. This will require you to invest in a more
robust training schedule, which may need to run more often than annually. This investment is
another way to reduce risk to your organization: these staff are responsible for the implementation
and could therefore be a point of failure if not properly trained on both the tool and the policy.
If you need to manually tag content in the tool, it would be appropriate to develop a group of
super users who have received more advanced training. These super users would be engaged in
situations where users are required to manually tag documents with data sensitivity labels, and
they would have a deep understanding of your organization’s data classification framework and
regulatory requirements.
Finally, your leadership should champion information security behaviors in order to reinforce to
the workforce the importance of risk management initiatives such as developing and implementing
a robust data classification framework. This includes assigning key leaders to promote the
initiative, sometimes referred to as ambassadors or champions of the change.
Industry Considerations
While the basic principles for developing a strong data classification framework are universal,
the details of your framework will depend on the nature of your industry and the unique
compliance and security factors your data demands.
For instance, financial services firms may need to consider compliance with a number of
regulatory frameworks depending on the scope of their business and the regions in which they
operate. For securities firms in the US, this means taking into account regulations like FINRA
Rule 4511, which addresses requirements around the security and retention of books and
records; similarly, firms operating in the UK need to consider FCA compliance.
Government agencies face a variety of regulations governing their data, which vary based on
territory and the nature of their work. In the United States, for instance, government agencies
and their agents that access federal tax information (FTI) are subject to IRS Publication 1075,
which aims to minimize the risk of loss, breach, or misuse of federal tax information.
While financial services firms and government agencies are among the most heavily regulated
organizations in the world, most businesses have industry-specific considerations that need to
be taken into account. Examples include:
See also:
● Onboarding and Adoption for Microsoft Information Protection (link pending)
● Microsoft Compliance Offerings
Checklist for Success:
Gather allies, form your stakeholder group, and obtain executive sponsorship
The information contained in this document represents the current view of Microsoft Corporation on the issues
discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should
not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of
any information presented after the date of publication.
This white paper is for informational purposes only. Microsoft makes no warranties, express or implied, in this
document.
Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under
copyright, no part of this document may be reproduced, stored in, or introduced into a retrieval system, or
transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for
any purpose, without the express written permission of Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights
covering subject matter in this document. Except as expressly provided in any written license agreement from
Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights,
or other intellectual property.
Microsoft, list Microsoft trademarks used in your white paper alphabetically are either registered trademarks or
trademarks of Microsoft Corporation in the United States and/or other countries.